Progress in the measurement of those vague, non-intellectual
characteristics of human behavior frequently grouped around the
term “personality” has been encouraged both by those who have
found intelligence tests very useful, and by those who have found
intelligence tests too limited in scope to serve their purposes. At 
the present time techniques are developing so rapidly, and interest is 
being aroused on the part of so many workers, that an inventory 
seems necessary at least once a year. In November, 1924, Symonds 
published in The Journal of Educational Psychology his summary 
called “The Present Status of Character Tests.” He there presented
the available evidence on two habit scales, the findings from at least 
ten studies on trait scales, and data on eight or ten tests of moral 
qualities. Even more complete was the summary presented by 
Hartshorne and May in The Pedagogical Seminar and Journal of 
Genetic Psychology for March, 1925. This excellent review, with its 
annotated bibliography of 68 titles dealing with the objective measure-
ment of character, is now in need of supplementation at a few points.¹
It purposely excluded rating scales, word-association methods and 
physiological methods of testing. Certain interesting tests, only 
indirectly related to moral character, were not included. Moreover,
since its publication the wealth of available materials has been
enlarged, Hartshorne and May having been among the principal
contributors to this progress. This review will endeavor to supplement
these previous studies by summarizing other data on rating scales,
physiological measures, questionnaires and tests. Repetition of
much of the excellent material contained in previous summaries will
be avoided. While endeavor has been made to cover the contribu-
tions to this field of measurement as completely as possible, no claim
can be made to exhaustive treatment.

¹ This review was prepared in January, 1926, previous to the publication of the
more complete bibliographies by May and Hartshorne (Psychological Bulletin,
July, 1926) and by Manson (Reprint and Circular Series of the National Research
Council No. 72, $1.00). The latter lists 1364 references up to 1926. In conse-
quence the bibliography prepared with this article has been omitted. A summary
now being prepared by the writer notes more than one hundred additional refer-
ences, published during 1926, dealing with character measurement.

First, let us ask, “What do we know about rating scales?” The
following findings have been established more or less firmly through 
the experimental work of the last 20 years. 

1. People differ markedly in their ability to make ratings (Nors- 
worthy, Rugg and Paterson). 

2. People differ in their reliability as subjects for ratings. Some 
are easier to rate than others. It appears that poor employees tend 
to be better analyzed than are good ones (Norsworthy, Rugg and 
Kingsbury). 

3. Traits differ in the success with which they can be rated. In 
general it seems desirable that ratings be based upon past or present 
accomplishment, that they be as objective as possible, that they be 
stated unambiguously and specifically (Paterson and Kingsbury). 

4. It is desirable to have traits defined. This definition should be 
as simple as possible, but unambiguous, definite, objective (Paterson). 

5. There is a tendency to skew the rating of every specific trait
in the direction of the total reaction of the rater to subject. This is
the well-authenticated “halo effect.” Knight found a correlation of
.94 between ratings on “quality of voice” and “moral stamina”
(Thorndike, Rugg, Knight and Franzen).

6. Raters having one form of contact with the individual being 
rated (teachers of the same school subject), tend to agree more closely 
than do raters with more diversified contacts. By the same token, 
ratings obtained from persons having predominantly one type of 
contact are much less useful outside of that specific field (Hanna). 

7. The average or median rating of a number of judges is superior 
to that of a single judge, provided there are not great differences in the 
capability of the judges (Rugg, Paterson and Gordon). 

8. Rating scales to be used in ordinary situations should be simply
stated and easy to use (Paterson).

9. Raters should be given training (Rugg and Kingsbury). 

10. There is no significant difference between the results obtained
by scales which demand that the rater shall rank the subjects in 
order of merit, and scales which provide a range of values which may be 
assigned each person. The latter is more congenial to most raters 
(Symonds). 

11. There is some evidence that immediate emotional reactions 
affect ratings made upon the “scale of values” method more than 
they do ratings made when subjects are ranked in order of merit
(Conklin and Sutherland). 

12. Statistically considered, seven seems to be the optimum 
number of intervals for scaling behavior (Symonds). 

13. The man-to-man scale, or “human ladder” has many advan- 
tages in securing desirable distributions and comparability of ratings 
(Scott). 

14. The graphic rating scale, in which the rater places a check 
upon a line rather than using statistical terms, has advantages in 
permitting fine discriminations and in being congenial to raters. 
Adjectives are usually placed along the line to indicate the meaning of 
sections of the line. Such scales should be at least five inches long, 
no breaks or divisions should be made in the line, the extremes and 
one to three other points should be defined in terms of universally 
understood words which are not too general in scope, and the favorable 
extremes should be alternated to correct the motor tendency (Freyd). 

15. The scale should, ordinarily, yield a normal distribution. If 
it does not, this may be statistically corrected. Individuals who rate 
constantly low or high should have their ratings corrected (Freyd, 
Kelly and Paterson). 

16. One trait should be rated through the entire group of subjects, 
rather than permitting the rating of one subject through the entire 
group of traits (Symonds and Paterson). 

17. A graphic scale which gives one sheet for each trait, indicating 
over each of the five or seven sections of the line-graph the approximate 
number or per cent of the group who should be given ratings in that 
general vicinity, tends toward a more widespread and normal series of 
ratings (Symonds). 

18. Self-ratings tend to be too high on desirable traits and too 
low on undesirable traits. They tend, however, to place the strong 
and weak points of the individual in their general positions. One
tends to rate one’s own sex higher than the opposite sex on desirable 
traits, the reverse being true of undesirable traits (Knight, Franzen, 
Kinder and Shen). 

19. People who are good judges of themselves tend to be good
judges of others. 

20. While close associates are likely to rate more reliably than are 
casual associates, long and intimate friendships bring marked decreases 
in the reliability of ratings. Persons tend to over-rate intimate 
friends on desirable traits and under-rate less desirable traits (Knight 
and Shen). 

21. “General all around value” is frequently more reliably rated
than are some of the more specific qualities involved (Rugg and 
Slawson). 

22. Ratings become more reliable when a general trait (e.g., 
developmental age) is broken into a number (18) of specific factors 
(Furfey). 

23. Ratings of which the rater expresses himself as “very sure”
are markedly more reliable than are ordinary ratings (Cady). 

24. Raters are frequently unable to justify ratings, or are apt to 
give absurd rationalizations. This does not, however, indicate any- 
thing about the reliability of the rating (Landis). 

25. Judges who have been asked to observe for several months, 
preparatory to rating, presumably give better ratings than do judges 
whose observation has been more or less casual (Webb). 
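
Findings 7 and 15 above lend themselves to a brief computational illustration. The following Python sketch is not drawn from any of the studies cited. It assumes that "correcting" a rater who rates constantly low or high means converting that rater's marks to standard scores, and that the combined rating is the mean of the judges' standard scores. The function names and the sample ratings are hypothetical.

```python
from statistics import mean, pstdev


def standardize(ratings):
    """Convert one rater's raw ratings to standard scores (mean 0, SD 1).

    This removes a constant tendency to rate high or low (an assumed
    reading of finding 15, not a procedure taken from the source).
    """
    m = mean(ratings)
    sd = pstdev(ratings)
    if sd == 0:
        # A rater who gives everyone the same mark conveys no information.
        return [0.0 for _ in ratings]
    return [(r - m) / sd for r in ratings]


def pooled_ratings(by_rater):
    """Average the standardized ratings of several judges, subject by subject."""
    standardized = [standardize(r) for r in by_rater]
    return [mean(column) for column in zip(*standardized)]


# Hypothetical data: three judges rate the same four subjects on a 7-point scale.
lenient = [6, 7, 5, 6]
severe = [2, 3, 1, 2]
middling = [4, 5, 3, 4]

pooled = pooled_ratings([lenient, severe, middling])
```

In this invented case the three judges order the subjects identically and differ only in leniency, so their standard scores coincide and the pooled rating preserves the common ordering.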

The reliability of ratings has been studied by almost every writer, 
with varying results. Some rating scales in which many of the above 
findings have been utilized seem to have reliabilities surpassing most 
existing group tests. Thus reliabilities found by Barr range from .40 
to .80; by Freyd from .52 to .87; by Webb .65 to .81; by Knight and 
Cleeton from .80 to .90; by Shen from .62 to .91 and by Furfey from 
.70 to .94 or .97. Such reliabilities are in decided contrast to Rugg’s 
results with the army scale, and probably will surprise many who have 
been inclined to hold the pretensions of rating scales as scientific 
instruments of research in more or less contempt. 

Among the best studies made by the use of rating scales is that of 
Webb, who studied two groups of college men consisting of 96 each, 
and four groups of school boys containing 35 each. Carefully building 
up reliable and valid ratings, he used them for an analysis of character. 
He found evidence to substantiate a general factor of intellectual
energy and sought for a similar factor in the field of character, seeming
to identify it with “persistence of motive.” Jones has used rating
scales for predicting the teaching success of college students and has 
found them more useful than grades or intelligence tests. Freyd and 
Brubacher have also developed rating schemes for teachers. Branden-
burg found that ratings correlated with money-success after college better
than did intelligence. Pressey found that ratings on “school attitude”
predicted marks better than did intelligence test scores. Chassell has 
studied citizenship habits in kindergarten and elementary school by 
rating scales, obtaining significant reliabilities. Porteus found that an 
eight-trait scale correlated .88 with estimates of the social fitness of 
defectives. The applications of rating scales to business are far too 
numerous to mention here. Miss Manson has listed 50 titles dealing 
with rating scales in industry, in her “Bibliography of Psychological 
Tests and Other Objective Measures in Industrial Personnel.”

Turning now to the second division of the field: what physiological
measures seem to have significance for personality traits? Studies 
have been made of the galvanic reflex, breathing, blood pressure, 
pulse beat, and reaction time. Brown found the galvanic reflex to 
correlate .32 with teacher’s ratings on “desire to excel” but -.08
with similar ratings on emotionality. Wechsler found a high cor- 
relation between galvanic reflex records and introspective affective 
ratings, and believes the instrument useful in distinguishing between 
abnormal conditions of similar appearance, e.g., manic-depressive 
stupor and catatonic dementia praecox. Marston found the galvanom-
eter too sensitive, overdoing the reaction so that he could not differ-
entiate between consciousness of truth and of fraud. Washburn and
her associates found that in a group selected as “cheerful” 40 per cent
gave over a 10 per cent reaction on the galvanometer, whereas of the
group selected as depressed, only 4 per cent showed such galvanic
reactions to the association tests submitted. She found no difference 
between pleasant and unpleasant associations in terms of the galvanic 
response. 

Benussi studied the inspiration-expiration ratio, and found that 
in truth-telling the subtraction of the ratios after speaking from those 
before speaking, gave a positive result, the reverse being true for 
lying. Burtt, studying the same technique, found it unsatisfactory, 
except in the case of an imaginary crime reported to a jury, in which 
case it worked in 73 per cent of the cases. Systolic blood pressure he 
found, however, to work in 91 per cent of the cases. He recommends 
a combination of both. Marston believes systolic blood-pressure the 
best criterion for deception because it eliminates the local effects of 
minor affective states, the irrelevant factors of intellectual work, 
variations due to minor bodily pains, and registers unequivocal 
changes. He found that consciousness of deception could be ade-
quately measured in 103 out of 107 cases. Larson improved the
technique, taking continuous sphygmograph, systolic blood pressure, 
heart-beat, and breathing curve. 

Goldstein studied reaction times in the performance of simple 
addition or subtraction, and found that consciousness of deception 
tended to increase the time. The writer studied reaction time in 
connection with crossing out nouns in materials which might be 
supposed to bring several different emotional states. The group (20 
theological students) was too small to be significant, but no marked 
differences were found, after due allowances were made for other 
variables, except in the case of material which seemed sexually stimulat- 
ing, in which case the reaction time was enormously lengthened. 
Burtt, studying 43 students in agricultural engineering, measured 
interest in material in terms of the number of irrelevant words, 
scattered through the material, which the subject could cross out in a 
given time. He believed that more interest in the material might lead 
to less careful attention to irrelevant words. He found a correlation 
of .30 between this measure of interest, and success in the professional 
application of the interest. Landis, Gillette and Jacobson studied the 
emotionality of 25 subjects with a battery of 19 tests and ratings. The 
most satisfactory criterion seemed to be pictures of the subjects, 
facial expression and head movements. The amount of laughter also
seemed to be good as a measure of emotional expressiveness. The
Woodworth questionnaire correlated fairly well (.12 to .57) with their
criterion. The Pressey idiosyncrasy seemed a better measure than 
Pressey affectivity. Blood pressure seemed very poor, but increased 
reaction time fairly good. 

Questionnaires and tests have been much more completely sum-
marized by Hartshorne and May in the article on “Objective Methods
of Measuring Character.” The authors mention 10 tests purporting
to measure ethical, moral, social, and religious attitudes, 27 tests of
personality traits running from “aggressiveness” to “will temper-
ament,” 7 tests of interests, attitude, and prejudice, and 5 tests of
instinct and emotion. These tests are analyzed with reference to the 
sort of reaction they expect from the subject, the scoring system, and 
methods of measuring reliability and validity. The better known 
tests, such as the Pressey X-O test for the detection of emotional
instability, the Brotemarkle comparison test for the measurement of
the subject’s understanding of moral terms in their conventional signif-
icance, the Kohs test of ethical discrimination, the Hart test of social
attitudes and interests, the Woodworth-Mathews questionnaire, the
Upton-Chassell citizenship scale, are of course included. The Downey
will-temperament tests received special attention from May in an article 
summarizing the results of six or eight previous studies. The conclu-
sions are uniformly disappointing. Reliabilities are very low, running
in different investigations from .05 to .90, -.23 to .65, -.05 to .37,
-.16 to .01, -.18 to .24, with the median for all traits and all investi-
gations probably not lower than .05 nor higher than .40. Validity in
relationship to ratings seems to range from -.65 to .54, with median
probably between .00 and .25. Miner recently selected two groups of
college students every one of whom showed one sigma or more dis- 
parity between IQ as measured by two tests, and scholarship. Pro- 
fessor Downey consented to try to separate by will-temperament 
profiles, the group in which scholarship would probably be higher 
than IQ from those in whom scholarship would probably be lower 
than IQ. She was right, as chance would provide, in 50 per cent of 
the cases. 

Perhaps the best of the tests reported by Hartshorne and May 
are the Voelker tests of trustworthiness and the conduct tests which 
Cady developed in the same general field. All of these tests are 
records of behavior in actual life-situations in which there are oppor- 
tunities for over-statement, cheating, taking or keeping money, 
failure to carry out responsibilities, etc. It does not detract from 
the value of the individual tests, but only questions the probable 
unity of the trait called “trustworthiness” that this writer finds a
correlation of only .20 between the score of boys on the “odd” tests of
Voelker’s battery and the “even” tests of the same battery. Voelker
reports a correlation of from .21 to .85 between the first battery and 
the second which contains paired tests. 
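
The odd-even comparison mentioned above can be sketched in a few lines of Python. This is an illustration only: the scores are invented, the product-moment coefficient is assumed as the measure of correlation, and the function names are hypothetical rather than taken from Voelker or from the writer's own computations.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def odd_even_totals(test_scores):
    """Split one subject's scores into totals for odd- and even-numbered tests."""
    odd = sum(test_scores[0::2])   # tests 1, 3, 5, ...
    even = sum(test_scores[1::2])  # tests 2, 4, 6, ...
    return odd, even


# Hypothetical battery of six tests, scored 1 (passed) or 0 (failed), for four boys.
boys = [
    [1, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 1],
]

halves = [odd_even_totals(scores) for scores in boys]
odd_totals = [h[0] for h in halves]
even_totals = [h[1] for h in halves]
r_halves = pearson_r(odd_totals, even_totals)
```

A low value of `r_halves` on real data would suggest, as the text argues, that the separate tests do not measure one unitary trait, without detracting from the value of any single test.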

One of the most complete studies mentioned by Hartshorne and 
May is Otis’s “A Study of the Suggestibility of Children.” Building
on 19 previous investigations, Miss Otis has developed group tests of 
suggestibility, having self-correlations of .41 to .67 and correlating 
non-suggestibility with intelligence at .75. 

Tests in which information of some sort is considered significant 
of personality traits are best illustrated, perhaps, by Ream’s study 
of interests in terms of information about games, lodges, songs, hymns, 
politics, dances, etiquette, slang, sport, dice, poker, billiards, roulette, 
the stock-market, and the police gazette. 

Other tests depend upon the arrangement of alternatives by
pupils. Chapman suggests a test of motives, based on the arrange- 
ment which pupils make of suggested reasons for going to high school, 
for saving money, and for reading good literature. He found pupil 
arrangements for different individuals correlating with the statistically 
correct order, some as low as -.45, others as high as .95. He found
steady progress from Grade VI through to graduate students. Fernald 
studied delinquents in a similar fashion, using offences, meritorious
acts, and ambitions, arranged in order of desirability of act or gravity
of offence by 15 competent adults.
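
The comparison of a pupil's arrangement with a standard order may be illustrated as follows. The source does not say which coefficient Chapman employed; the sketch assumes the rank-difference (Spearman) coefficient for untied ranks, and both the function name and the sample arrangements are hypothetical.

```python
def rank_difference_rho(ranks_a, ranks_b):
    """Spearman rank-difference correlation for two rankings without ties.

    rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference
    between the two ranks assigned to the same item.
    """
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical example: five reasons for going to high school,
# listed by their rank in the standard ("correct") order.
standard_order = [1, 2, 3, 4, 5]
pupil_order = [2, 1, 3, 5, 4]      # two adjacent pairs interchanged
reversed_order = [5, 4, 3, 2, 1]   # complete inversion of the standard

rho_pupil = rank_difference_rho(standard_order, pupil_order)        # 0.8
rho_reversed = rank_difference_rho(standard_order, reversed_order)  # -1.0
```

Under this assumption a pupil who nearly reproduces the standard order scores close to 1, while one who inverts it scores -1, which is the kind of range reported above.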

Decision types have been measured by Bridges on the basis of 
tests of constancy of preference when two stimuli were simultaneously 
presented. The subject indicated which he liked better by raising 
the hand on the corresponding side. Decisions covered a wide range of 
interests, e.g., virtues, music, literature, fruit, inventions, animals, 
painting, modes of transportation, etc. The time taken to make a 
choice was measured. Accuracy of decision was measured in terms 
of right decisions as to which of two perforated cards had the more 
holes in it, each being too large a number to count. Suggestibility 
was measured by choices agreeing with a statement (true only half 
the time) that one of the two stimuli was most often or least often 
chosen. Contra-suggestibility was also measured in this situation. 
Tests of originality consisted of directions to draw a figure as differ- 
ent as possible from a circle, to draw a figure as different as possible 
from a rhombus with a diagonal, and to give words as different as 
possible from such stimuli as mind, cause, substance, and heaven. A 
number of studies have been made of the reaction of people to photo- 
graphs, beginning with Ruckmick, then Langfeld, then Laird and 
Remmers. Recently Gates has used the ability to identify emotions 
shown in photographs as a measure of social perception, and has found 
very definite age levels. 

Among the other unusual and interesting tests listed in the Hart- 
shorne-May summary were tests of aggressiveness and strength of 
instincts (Moore) measured in terms of the amount of distraction a 
subject would stand; test of ability to foresee consequences (Chassell) 
obviously an essential aspect of desirable social conduct; tests of con- 
formity (Deutsch) measured in terms of the conventionality of one’s 
preferences in the line of beauty, marriage systems, modes of trans- 
portation, superstitions, costumes, ideas about immortality, etc.; a 
test of achievement-capacity (Fernald) which perhaps might be called 
will-power, measuring the length of time men are willing to stand on
their toes; a test of historical judgment (Van Wagenen) which meas-
ures the tendency of pupils to project certain motives rather than
others into the interpretation of the conduct of persons in history;
and a test of money-mindedness (Shuttleworth) based on the Hart
scheme of finding items in a list most strongly liked and disliked.

Thus far summaries have been made of rating scales, physio-
logical measurements, and the tests which have been listed in previous
reviews. The remaining task is to study the tests and question-
naires pertaining to personality traits which have not been included
in previous summaries. For convenience, these will be presented as
tests of significant information, tests of environment, tests of attitude
and emotional state, tests by the free or controlled association method,
other tests using pencil and paper, and a few miscellaneous tests.

Recently Miss Orr has added to the tests of significant infor-
mation by developing, in connection with the Character Education
Inquiry, a test of good manners, which seems to be closely related
to home background. Gates and Strang published in June a “Health
Knowledge Test” for children. Miss Schwesinger has compiled after
careful research a test of social-ethical vocabulary which has a reli-
ability of .99, and which correlates highly with other vocabulary
tests. Various tests of religious ideas and information have been 
developed, particularly the Union tests, and Porter’s advanced bible 
knowledge test. 

Environmental factors are commonly measured in terms of the 
Whittier scales for grading homes and neighborhoods. Chapman and 
Sims in September published a scale for measuring socio-economic 
status, the relative significance of its items being determined not by 
opinion but by statistical studies of association. 

Attitude tests began with compositions by students on their 
reaction to moral problems. A second step was the setting of specific
forms of reaction. Thus Tanner and Barnes have found that students
will in large degree condemn acts which they would take no steps
toward preventing. Still further refined, these questionnaires have
taken the form of “ethical discrimination tests,” such as the Union
tests, or the various examinations developed by the national committee
on religious examinations of the Y. M. C. A. One of the best attitude
studies of this type was made by McGrath, who tested 4000 children
using question and answer tests, pictures, short stories with possible
completions suggested, and a vocabulary test. Probably the best
tests of the intellectual factors involved in character are the C. E. I.
tests which are being developed by Hartshorne and May. A résumé
of their work will be found in Religious Education, each issue during 
1926. A test of “opposites” having a reliability of .96 if lengthened 
to require one hour of time for taking, was discarded because of the 
Schwesinger study in social-ethical vocabulary. A “similarities” 
test was likewise developed but not used. The “word consequences”
test asks subjects for the most probable consequences of such acts as 
lying, gambling, etc. and for the selection of best and worst among 
those consequences. A “cause and effect” test studies perception of
the relationships of social phenomena by a true-false test having a reli- 
ability for one hour of .88 and a correlation with unweighted criteria 
of .52. The “duties” test asks subjects to discriminate among acts 
which are or are not moral duties, this test having a reliability of .95 
for an hour’s time and a correlation of .54 with the criterion. The 
“comprehensions” test asks pupils which they would do or say in cer-
tain typical situations and has, for an hour, a reliability of .90 but a
correlation of only .37 with the criterion. The “provocations” test 
endeavors to stretch conventional moral responses by introducing 
counter-provocations to see what the subject believes it best to do in 
such a situation of conflict. The reliability here would be .90 for an 
hour, the correlation with criteria .42. The “foresights” test
endeavors to obtain pupils’ notion of consequences when none are sug-
gested; the “recognitions” test measures ability to classify correctly
certain acts under general terms like dishonesty or impurity (reliability,
.89; validity, .58). The “principles” test studies knowledge of con-
ventional ethical principles by a true-false method (reliability,
.92; validity, .64), and the “applications” test (reliability, .91; valid- 
ity, .42) measures ability to apply these principles to situations. Out 
of this total battery significant test elements have been assembled in 
new and shorter forms and are undergoing further study. 
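
The reliabilities quoted above "for an hour" are evidently projections to a common length of test. The source does not state how they were obtained; the Spearman-Brown formula is the customary device for such estimates, and the Python sketch below assumes it. The function name and the sample figures are hypothetical and are not taken from the C. E. I. reports.

```python
def spearman_brown(reliability, length_factor):
    """Estimated reliability of a test lengthened by `length_factor`.

    r_k = k * r / (1 + (k - 1) * r), where r is the reliability of the
    original test and k the ratio of new length to old length.
    """
    k = length_factor
    return k * reliability / (1 + (k - 1) * reliability)


# Hypothetical case: a 15-minute test with reliability .70,
# lengthened fourfold to occupy one hour.
one_hour = spearman_brown(0.70, 4)

# The same formula steps up a split-half correlation to the full test,
# for instance a half-test correlation of .20 with the length doubled.
full_battery = spearman_brown(0.20, 2)
```

The formula presumes that the added material is of the same kind and quality as the original, so estimates of this sort are best read as upper expectations rather than as observed reliabilities.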

Along a slightly different line, attitudes have been studied in 
relation to the interests of life. Freyd has presented a test for voca- 
tional interests which includes attitudes toward occupations, recrea- 
tions, all things, it seems, whether artistic, scientific, literary, social, 
mechanical or solitary, and ranging from ‘long walks” to “fat 
people” as objects for liking or disliking. Greene has studied 
“usual feelings” with a questionnaire which reveals aesthetic, logical,
and social trends. Clinchy proposes to measure the influence of college
on men by means of an omnibus test including attitudes toward every-
thing from the League of Nations to poetry about moonlight. Travis
has developed two forms of a test in which the subject is called upon
to rank in order of preference, traits chosen to illustrate certain clinical
types of individuals. He finds a self-correlation of .94 and a correlation
with associates’ rankings of .07 to .49. Myerson studied person-
ality with a multiple choice test in which cynical, humorous, conven-
tional, naturalistic, and pessimistic responses were provided. He was
interested in the method, time, and form of choice as well as
its trend. Harper is now completing his study of social beliefs and 
attitudes in teachers, finding, interestingly enough, a negative 
correlation between conservatism and change of opinion. Sturges’ 
various unpublished “Studies in the Dynamics of Attitude” have 
consisted of question-ballots showing a person’s opinion on militarism- 
pacifism, or fundamentalism-modernism, and similar issues. These 
tests have been applied before and after the stimulus of reading 
certain books, hearing certain speeches, taking certain courses.
By varying the time of exposure to given stimuli he has built up a
curve which may give significant data on the factors producing change
in attitude. Symonds constructed an attitude test for the separation
of the liberal-progressive-radical group from the conservatives. His
criterion was the judgment of five people. He found a reliability of
.67, a correlation between information and liberalism of .36, but also
that the liberalism score did not rise through the grades of school and 
college as did information. Porter at the University of Chicago has 
developed a very exhaustive test of attitudes toward war, which he is 
using in his study of the R. O. T. C. The writer published during the
past year his “Measurement of Fair-mindedness,” setting forth studies
with a test called “A Survey of Public Opinion” but combining really six
different methods of measuring prejudice upon religious and economic
matters. The test appears to have a reliability in determination of
gross prejudice score of .96, a correlation with criteria of about .80,
and much less reliability and validity in the determination of the
exact degree of prejudice in each of 12 suggested typical directions.
Bogardus has developed recently his “‘Social Distance Test,” which 
asks subjects to indicate the relationships in which they would be 
willing to see other races, other religious groups, or other economic 
classes. The relationships range through seven stages, from the 
intimacy of marriage to the opposite extreme at which the subject 
would be unwilling to admit such people to the country at all. 

Along a slightly different line, two variations of the Woodworth
questionnaire have appeared. The Colgate mental hygiene test 
developed by Laird has been given very extensively, and tends to 
classify subjects according to certain temperament types developed 
in abnormal psychology (introvert, extravert, etc.). It has a reliabil- 
ity between first and second takings of .67; between first and second 
halves of .45. Chassell working with the writer has developed 
a similar “Emotional History Record” containing the types 
of question which Laird uses and in addition, questions which he, 
doubtless more wise, expurgated; also a self-rating scheme based 
on the Knight-Franzen notion of comparing self with average and 
ideal, and a personal history blank that records childhood experiences,
at home and school and among friends, which may have been significant 
in producing the attitudes revealed by the first part of the test. 

Methods of word-association continue to yield interesting and 
promising results. Washburn, Morgan, Harding, Simon, and Tomlin- 
son have studied moods of depression and cheerfulness by using 
50 stimulus words, and urging subjects to continue the association 
until a pleasant or unpleasant response occurred. This was repeated 
with different words for five days. A criterion was built by 
ratings of self and friends. Sixty-five per cent of the highest 
quartile in cheerfulness stood highest in pleasant associations, 
with 33 per cent of the lowest group also in the lowest quartile. 
Raubenheimer, in one of the most significant studies of delinquent 
boys yet made by tests, used in addition to an over-statement 
test about number of books read, character preferences, reading which 
the boy enjoyed (on the basis of fictitious but suggestive titles), 
activities enjoyed, rating of offences with respect to their gravity, 
and willingness to make other over-statements, a controlled associa-
tion test, using words like “teacher,” “policeman,” “smoking,” etc.
Laslett improved this technique by eliminating words which did not 
discriminate between groups, and keeping words such as “term”
which received an association of “jail” with one group or
“school” with the other, or “bar” which to one group meant saloon,
to the other, candy. He found a reliability of .82. Chambers used 
the Pressey expurgated list and selected words which differentiated 
Grades VI to VIII from Grades X and XII. These 94 words he calls 
a measure of emotional maturity. A most significant study in
this method has been made by Stumberg, who found that 
sophisticated subjects could very effectively prevent detection 
by word-association methods. Some of the techniques they used were
lengthening all reaction times, lengthening reaction times willfully 
at irrelevant points, associating non-crucial words with later ones 
in the test, having certain words in readiness, and using a given mental 
set which produced words of a certain kind more or less regardless of the 
stimulus. 

The pencil and paper tests comprise, naturally, the largest group. 
Some of them are simply adaptations of existing tests. Thus “caution”
may be measured by the relation of attempts to errors on intelligence
tests. The AQ has long been familiar as a measure of effort
and teaching efficiency. Symonds suggests as a measure of studious- 
ness, the difference between the sigma position of a subject in his score 
on an announced quiz, and his sigma position in an intelligence test. 
Using term marks instead of such a quiz, he found that this trait 
correlated .65, or higher than intelligence, with marks. 

Washburn found as a measure of emotionality, that the number 
of emotional experiences, either pleasant or unpleasant, which a 
subject could recall served as a good index. Using two groups 
separated on the basis of ratings by self and friends, only 16 per cent 
of the calm group made more than 30 recollections, while 60 per cent 
of the emotional group recalled more than 30 strong emotional experi- 
ences. Burtt in addition to the study of reaction time mentioned 
above, measured interest in terms of memory for word pairs, where 
some were significant of the given interest, others irrelevant. Later 
when one word was pronounced the pairs remembered seemed signifi- 
cant of interest. McGeoch used tests of imagination based on the 
number of associations to 10 ink blots, the number of words constructed 
from two sets of six letters each, and the number of sentences built 
from two sets of six words each. He found that counting the words 
used in the ink-blot association gave a measure of quality as well as 
quantity of associations. The tests are reported to have a reliability 
of .60. Chapman, as reported in the Hartshorne-May summary, 
used a similar word-building task as a basis for studying persistence, 
success, and speed. 

A few suggestive measures have been made without the too 
frequent paper and pencil situation. May secured a measure of 
the industry of students at Syracuse on the basis of time records 
kept for a week at the beginning of the year and near the middle 
of the year. It proved a factor very significant in the prediction 
of term marks. 
Tests for humor have been suggested. The writer has been
interested especially in using the Healy Completion Form-Board with 
instructions to substitute not the most appropriate but the funniest 
object in each blank. 

The most thorough-going character tests have been those of 
dishonesty in school situations developed by the character education 
inquiry, through the work of Hartshorne and May. In varied forms 
such tests have yielded reliable and obviously valid data regarding the 
actual conduct of several thousand children. More complete accounts
of this work will appear in their reports and publications.

Several interesting experiments have been made, not so much in 
the development of new instruments as in the trying out of old ones. 
Lowe and Shimberg investigated the fables of the Stanford-Binet as a 
possible test to discriminate between 1400 delinquents and 500 non- 
delinquents. There seemed to be no relationship at all with any 
moral factor, but the usual close correlation with mental age. Olson 
tried the Pressey X-O on 24 subjects of the Manhattan State Hospital
and found the idiosyncrasy measure quite significant, a variation of
54 per cent from the norm, although the affectivity score was only
5 per cent below the norm. Chambers found that the Pressey X-O
predicted college achievement as well as did intelligence scores.
Terman in his study of gifted children gave five tests used by Rauben- 
heimer, and two formerly used by Cady. He found that 86 per cent 
of the gifted children equalled or exceeded the median of the normal 
control group. How far this was due to increased insight into the 
purpose of these conduct tests, he does not consider. Lentz has 
done an excellent piece of work in trying out tests on delinquent 
groups and on non-delinquents who are equivalent in intelligence, 
education, home status, etc. He found that tests which differentiated 
in one group might not apply in others. Murdoch recently reported 
an interesting endeavor to measure differences in moral traits between 
the races found in Hawaii schools. Questionnaire, grades, marks, 
occupation, teacher estimates, ratings on the Upton-Chasell citizen- 
ship scale, score on N. I. T. and army beta, Seashore pitch discrimina- 
tion tests, and Voelker’s “circles” test of trustworthiness, were utilized. 
She found that social status correlates highly with IQ and morality, 
that Anglo-Saxons and Orientals are highest in intelligence, but that 
the Orientals, especially the Chinese, excel in moral traits. 

In conclusion it may be permitted the writer to state the impression 
which the survey of this imposing array of material has left upon him. 
First, there is the appreciation of the extent of work already done.
It should no longer be necessary for a writer to begin his work, as did
one in 1924, “Despite the recognized importance of personality, there
has been no attempt, so far as the author is aware, to analyze it into
its constituent factors.” Any such statement, at least in the future,
will be a confession of the unfortunate ignorance of the author. It 
is almost impossible to find a trait for which an adjective exists which 
has not been approached with some degree of suggestive investigation. 

A second, and fundamental problem, however, concerns the 
“fakability” of most of the measures. Aside from the rating scales
and physiological measures, there are comparatively few in which an
individual who caught on to the purpose of the test could not raise his
score at will. Sometimes this purpose is disguised by a title, a type
of direction, or a complicated scoring scheme. With the possible
exception of the very complicated scoring schemes, however, all such 
tests will be limited in usefulness. If ever the great body of workers 
with school-children, delinquents, and industrial applicants were to 
start using tests of the present sort as extensively as intelligence tests 
are now used, the results would soon not be worth a whistle. 

A third impression, possibly too carping, concerns that group who 
continue to publish investigations in the field of personality traits 
on the assumption that since nothing of importance is known, every 
little will help. They feel it unnecessary to safeguard ratings, to 
correct for errors, to study reliability, to investigate validity, or to try 
out differentiating tests on groups other than the ones upon which 
they were differentiated. Often tests may be meticulously studied
but criterion groups very superficially selected. Correlations are
used recklessly without regard for the linearity of regression. As one
who is certainly open to criticisms of this sort, the writer here repents, 
and aspires to join the great majority whose careful, painstaking, 
ingenious contributions are so enriching knowledge about and control 
over the less tangible aspects of human behavior. 
NEURO-MUSCULAR CAPACITY OF CHILDREN WHO
TEST ABOVE 135 IQ (STANFORD-BINET)* 


JANE E. MONAHAN 
Public School 165, Manhattan 


AND 
LETA S. HOLLINGWORTH 


Teachers College, Columbia University 


It has already been shown in previous publications of data,*-4 
that young children testing above 135 IQ (Stanford-Binet) excel 
children matched with them in age, race and sex, but unselected 
for intelligence, in speed of wrist tapping and in grip. The experi- 
ments to be reported here bear upon neuro-muscular capacity in 
two additional kinds of performance—the standing broad jump 
and chinning—and a further study is here made of power to squeeze 
with the hand. Children in the special opportunity classes at Public 
School 165, Manhattan, described previously,** constituted what is 
herein called the experimental group. These children were boys and 
girls, ranging at the time of the investigation here set forth from 108 
months to 149 months, in birthday age, with a mean at 135.2 months. 
They ranged in IQ from 135 to 190, with a mean at 152. 

Our control group was formed by the following method. Every 
child in the experimental group was paired with a child selected 
from the registers of the regular classes in the same school, where 
children of average scholastic performance are placed. The control 
child was in each case of the same sex, of the same age within a month,†
and judged to be of the same racial stock as the experimental child, 





* This report is rendered as part of the work of a joint committee, in charge 
of special opportunity classes for gifted children, at Public School 165, Man- 
hattan. The members of this committee are Mr. Jacob Theobald and Miss Jane 
Monahan, of Public School 165, Miss M. V. Cobb, Dr. Grace A. Taylor and Dr. 
L. S. Hollingworth, of Teachers College, Columbia University. The work was
carried on for three years, with the advice of District Superintendent John E. 
Wade, and in cooperation with the division of educational psychology, of the 
Institute of Educational Research, at Teachers College. The work of this report 
was supported by funds granted through the institute, by the Carnegie corporation 
of New York. 

† This maximum difference of one month, plus or minus, in age was exceeded
in but two of the 45 pairs, so chosen as to cancel each other (see Table VI). 
against whom he or she was matched. The IQ’s of the children
in the control group were not taken, but scholastic status and 
group tests of intelligence indicate IQ’s clustering close around 
100. No very bright or very dull children are included, as these are 
placed by means of mental tests, in special classes at Public School 165. 
Since girls do not attend the regular classes of this school above the 
fourth grade, ten girls to match those in our experimental group 
were selected as described above from Public School 93, Manhattan, 
by courtesy of the principal, Miss Laura Charlton. 


THE STANDING BROAD JUMP


The standing broad jump was taken in the conventional manner, 
in the gymnasium of Public School 165. The work was done between 
the hours of 9 and 12, in the forenoon. Miss Mary Ducey, teacher 
of physical training, supervised the jumping in all cases, and deter- 
mined the record, which was taken to the nearest half inch with a 
tape measuring from the toe mark on the mat to the back of the heel 
landing in the rear at the end of the jump. The mat used was of 
rubber, an item of standard gymnasium equipment, marked off in 
three inch spaces. The children wore rubber-soled shoes, except 
that a few pairs jumped in stocking feet. Ordinary indoor clothing 
was worn, all coats being however removed. The audience in all cases 
was composed of the 44 children of the particular group being tested, 
who were not at the moment jumping, and the three adult investi- 
gators. Practice in jumping had been had by both groups, in the 
ordinary course of the school. It would have been impossible to 
determine whether the members of each pair had previously jumped an 
equal number of times, and no attempt was made to take this into 
account in matching the children. 

The conditions and the procedure were thus as nearly identical as 
was possible for both comparative groups. Each control child jumped 
in the same room, on the same mat, during the same forenoon, under 
the same instructions, given by the same person, under the same 
conditions of foot-wear, audience and illumination as his experimental 
associate. Every child in both groups took two trials, separated in 
each instance by the interval required for the whole group of 45 
children to complete one trial each. The record compiled for the comparison
is in every case the better of the two trials.

Table I gives the comparative distribution of the groups, in terms 
of inches jumped. By reference to Table VI, it will be seen that the 
mean jump is 58.66 inches for the experimental group, and 58.84 inches
for the controls, with a PE of the difference of 1.11 inches. There is, therefore, no
demonstrated difference between the gifted and the ungifted groups, 
as concerns the distance jumped, since the difference between the 
means is very much less than the error probable from sampling. 

Correlation between the two trials in jumping, for the 90 subjects 
used as one group, yields r = .89 ± .01.


TABLE I.—DISTRIBUTION OF INCHES JUMPED BY GIFTED CHILDREN, AND BY CONTROLS
OF ORDINARY INTELLECTUAL PERFORMANCE, IN THE BETTER OF TWO TRIALS

INCHES      GIFTED    CONTROL
JUMPED      GROUP     GROUP
75-70          1         4
70-65          9         8
65-60         10        11
60-55         10         6
55-50         12         5
50-45          1        10
45-40          1         0
40-35          1         1

Total         45        45


CHINNING


Chinning was carried out in the manner usual in school gym- 
nasiums. It was performed in the gymnasium of Public School 165, 
in the hours between 9 and 12, in the forenoon. The wooden hori- 
zontal bar which is a standard item of gymnasium equipment was used. 
In one of the two trials given this bar was set at 48 notches from the 
floor, and in the other trial, at 52 notches. Miss Ducey also gave this 
test to every child, and determined the record in each instance. The 
child, standing on a stool, grasped the bar with both hands, knuckles 
outward, and let the body down as the stool was slipped from under. 
He or she then endeavored to raise the body, held as nearly rigid as 
possible, as many times in succession as could be done, to rest the 
chin on the horizontal bar. 

In this performance coats were removed, other ordinary indoor 
clothing being worn. Here, also, the audience consisted of the inves- 
tigators and the members of the group being tested. Each control 
child was tested in the same place, on the same bar, during the same 
forenoon, by the same person, with the same instructions, before an 
audience of the same size as his or her experimental associate. Three 
pairs for whom there are data in other respects here studied are omitted
from this performance because of asthma, wrist not yet fully recovered 
from fracture, and stiff neck, afflicting either member, which would 
invalidate a trial in chinning. 

Table II gives the comparative distribution of the groups, in terms 
of the number of times each child succeeded in chinning himself 
or herself. By reference to Table VI, it will be seen that the mean 
for the experimental group is .98, and for the controls 1.67, with a
PE of the difference of .27. This indicates no high degree of reliability
of the difference between these means; but inspection of the distributions in Table
II shows that children who can chin themselves more than once are 
much more frequent among the controls than among the gifted. The 
record of a single attempt, moreover, seems to be a highly reliable 
indication of capacity to chin, as the correlation between the two 
trials given, for the 90 subjects, yields r = .90 ± .01. These facts, in
conjunction with the difference in the averages, give weight to the
conclusion that the gifted children are actually excelled by control 
children in power to chin themselves under identical conditions. 


TABLE II.—DISTRIBUTION OF SUCCESSES IN CHINNING, BY GIFTED CHILDREN
AND BY CONTROLS OF ORDINARY INTELLECTUAL PERFORMANCE, IN THE
BETTER OF TWO TRIALS

NUMBER OF    GIFTED    CONTROL
SUCCESSES    GROUP     GROUP
 0             24        22
 1              8         3
 2              4         8
 3              3         3
 4              0         1
 5              2         1
 6              1         0
 7              0         1
 8              0         2
 9              0         0
10              0         1

Total          42        42
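
The means quoted in the text can be recomputed from the frequencies of Table II. The following is a modern illustrative sketch in Python and forms no part of the original report; the function names are hypothetical, and only the counts are taken from the table.

```python
# Frequencies copied from Table II: number of successes -> number of children.
gifted = {0: 24, 1: 8, 2: 4, 3: 3, 4: 0, 5: 2, 6: 1, 7: 0, 8: 0, 9: 0, 10: 0}
control = {0: 22, 1: 3, 2: 8, 3: 3, 4: 1, 5: 1, 6: 0, 7: 1, 8: 2, 9: 0, 10: 1}


def mean_successes(freq):
    """Mean number of successes for a frequency distribution."""
    n = sum(freq.values())
    return sum(successes * count for successes, count in freq.items()) / n


def share_above(freq, threshold):
    """Proportion of children with more than `threshold` successes."""
    n = sum(freq.values())
    return sum(count for successes, count in freq.items() if successes > threshold) / n


print(f"{mean_successes(gifted):.2f} {mean_successes(control):.2f}")   # 0.98 1.67
print(f"{share_above(gifted, 1):.2f} {share_above(control, 1):.2f}")   # 0.24 0.40
```

The recomputed means agree with the .98 and 1.67 entered in Table VI; the second line gives the proportion of each group chinning more than once, which is the comparison the text draws from inspection of the distributions.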


GRIP IN THE HAND 


Grip was taken by means of the Smedley dynamometer, the same 
instrument being used for every child in both groups, since it has 
been demonstrated that these dynamometers differ in accuracy, unless 
kept constantly in calibration. The records were taken by Dr. Hollingworth.
Three trials, taken alternately by right hand and by left
hand, were given for each hand. The record used in compiling Table 
III was in every instance the best of these six trials, regardless of the 
hand in which it was made. A majority of children in both groups
made the best record with the right hand, but some in each group made 
the best with the left. 

The record for each control child was taken within a week of that 
for his experimental associate, half being taken a week previously, and 
half a week subsequently, so that no constant error in either direction 
arises from the discrepancy of a week or less, in development.

Table III shows the comparative distribution of the groups, in 
terms of the kilograms registered on the dynamometer, by the best 
trial in six, with either hand. Reference to Table VI will show 


TABLE III.—DISTRIBUTION OF STRENGTH OF GRIP, IN KILOGRAMS, OF GIFTED
CHILDREN AND OF CONTROLS OF ORDINARY INTELLECTUAL PERFORMANCE

            GIFTED    CONTROL
KILOS       GROUP     GROUP
40-35          0         1
35-30          4         3
30-25         14         8
25-20         18        23
20-15          9         8
15-10          0         2

Total         45        45


that the mean grip for the gifted is 25.0 kilograms, while the mean
for the controls is 23.4 kilograms, with a PE of the difference of .48. The difference
thus found is not much more than three times the probable error of 
sampling. However, it is no doubt real, since the superiority of the 
gifted in grip has already been demonstrated by the present investi- 
gators, by measurements taken on these same children at an earlier 
age, in comparison with a different control group,? by Baldwin, working 
with Terman’s subjects in California,® and by others. 


THE BROAD JUMP IN RELATION TO WEIGHT LIFTED


Up to this point, the present investigators had before them the 
facts that though the intellectually gifted excel ordinary school chil- 
dren in the motor performances of tapping and of gripping, they are 
but equal to the latter in the broad jump, and are inferior in power to 
chin themselves. An important element appears in both jumping 
and chinning, which does not appear in tapping or gripping. In
jumping and in chinning the body weight must be lifted.

The investigators also knew from comparative measurements carried
out by themselves, and from other investigations, that intellectually
gifted children are heavier than ordinary children, as a group;
and not only heavier, but plumper—that is, they have a relatively 
greater weight-height coefficient. It occurred to them, therefore, 
to determine the comparative ability of the gifted to lift body weight, 
and they adopted a method for doing this in the case of jumping. 

Comparison in terms of foot-pounds is not exactly feasible since in 
the broad jump the body is not lifted vertically, but in a forward 
moving arc. A measure somewhat analogous to measurement in 
foot-pounds may be achieved in the present instance, by multiplying 
the number of inches jumped by weight in pounds, and dividing by 12. 
This gives a measure of the amount of work done by a given neuro-muscular
system in propelling a given body weight. Perhaps an 11-year-old
boy, weighing 135 pounds, does as much work in jumping three
feet, as is done by a boy of like age, who jumps five feet, but weighs 
only 80 pounds. In racing horses, for instance, it is recognized that 
even a pound of extra weight to carry constitutes a handicap. 
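
The measure just described, and the example of the two boys, can be written out as a short computation. This is a modern illustrative sketch in Python rather than anything in the original report; the function name `work_index` is hypothetical.

```python
def work_index(inches_jumped, weight_pounds):
    """Inches jumped times weight in pounds, divided by 12.

    This is the report's measure "somewhat analogous" to foot-pounds:
    feet jumped multiplied by body weight.
    """
    return inches_jumped * weight_pounds / 12


heavy_boy = work_index(36, 135)  # jumps three feet, weighs 135 pounds
light_boy = work_index(60, 80)   # jumps five feet, weighs 80 pounds

print(heavy_boy, light_boy)  # 405.0 400.0
```

On this measure the two boys come out nearly equal, which is the point of the comparison in the text.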

All of the children involved in the present study, both in experi- 
mental and control groups, were weighed on the same scales (Toledo- 
No-Springs scales), by the same examiner, during the week following 
the measurements of performance. Table IV gives the comparative 
weights of the two groups. By reference to Table VI it will be seen 
that the mean weight of the gifted is 88.10 pounds, while for the control 
group it is 81.24 pounds. This difference is what would be expected 
from previous studies of the comparative size of gifted children. 
The gifted had to raise on the average about 7 pounds more of body 
weight, in jumping and in chinning, than their paired controls had to 
raise, an excess aggregating approximately 300 pounds. 

To gain further insight into the relationship between weight and 
capacity to perform the broad jump, a correlation was computed 
between body weight, and inches jumped in the better of the two 
trials given. This correlation was limited to the 54 of our 90 subjects, 
who had passed the eleventh but had not reached the twelfth birthday. 
For these 11-year-olds, weight in pounds correlated with inches
jumped yields r = −.22 ± .09.

It seems reasonable to infer, therefore, that surplus of body weight 
constitutes a handicap in events where the body must be lifted by its 
neuro-muscular mechanisms. Subjects of exactly the same birthday
might yield a correlation even more highly negative. It will be
recalled in this connection, that the correlation between two trials in
jumping yielded r = .89 ± .01, showing that the data upon which
the negative correlation is founded are very reliable.
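
The report does not say how the probable errors attached to its correlations were obtained. Assuming the conventional formula of the period, PE of r = 0.6745 (1 − r²) / √N, the following modern sketch in Python reproduces the three values quoted; the formula is an assumption on the editor's part, and the function name is hypothetical.

```python
import math


def probable_error_of_r(r, n):
    """Probable error of a correlation coefficient (assumed formula):
    0.6745 * (1 - r**2) / sqrt(n)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


# (r, N) pairs as reported: two trials in jumping, two trials in chinning,
# and weight against inches jumped for the 54 eleven-year-olds.
for r, n in [(0.89, 90), (0.90, 90), (-0.22, 54)]:
    print(f"r = {r:+.2f}, N = {n}: PE = {probable_error_of_r(r, n):.2f}")
# r = +0.89, N = 90: PE = 0.01
# r = +0.90, N = 90: PE = 0.01
# r = -0.22, N = 54: PE = 0.09
```

Under this assumption the negative correlation of −.22 is between two and three times its probable error, so the inference of a handicap from weight rests on a modest margin.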


TABLE IV.—DISTRIBUTION OF WEIGHT OF GIFTED CHILDREN AND OF CONTROLS
OF ORDINARY INTELLECTUAL PERFORMANCE

            GIFTED    CONTROL
POUNDS      GROUP     GROUP
140-135        1         0
135-130        1         0
130-125        0         0
125-120        0         0
120-115        0         0
115-110        2         2
110-105        2         1
105-100        4         1
100- 95        4         5
 95- 90        5         3
 90- 85        1         2
 85- 80        5        10
 80- 75       10         4
 75- 70        5         5
 70- 65        4         6
 65- 60        1         5
 60- 55        0         1

Total         45        45


Measurements of height were not taken in this instance, but we 
know from previous studies that the gifted are slightly taller than 
ordinary children, as well as heavier. In the cases here studied, the 
gifted carried their surplus of body weight as far as the controls 
jumped. In this feat slightly greater tallness may have been a 
factor, as well as the superior neuromuscular vigor shown in tapping 
and in gripping. Tallness would, however, hardly serve as an aid in 
chinning. This may account for the fact that in the latter perform- 
ance they did not overcome the handicap of weight. 

Table V shows the distribution of the measurements described 
above as analogous to foot-pounds, namely, the number of feet 
jumped multiplied by body weight in pounds. Table VI states that 
the mean for the gifted group is 426.0, while for the control it is 399.0, 
with a PE of the difference of 11.68. This difference, in favor of the gifted, is not as
much as three times the probable error. However, inspection of
Table V shows that the greater mean performance of the gifted is
not due to a few extreme cases, but to a majority of cases in the distribution.
Greater numbers, or repetition of the experiment, would
probably prove the difference in favor of the gifted to be real.


TABLE V.—DISTRIBUTION OF FEET JUMPED × WEIGHT IN POUNDS, OF GIFTED
CHILDREN, AND OF CONTROLS OF ORDINARY INTELLIGENCE

FEET JUMPED    GIFTED    CONTROL
× WEIGHT       GROUP     GROUP
640-620
620-600
600-580
580-560
560-540
540-520
520-500
500-480
480-460
460-440
440-420
420-400
400-380
380-360
360-340
340-320
320-300
300-280
280-260
Below 260


SUMMARY OF FINDINGS 


From studies made on gifted children at Public School 165, Man- 
hattan, the following are the findings summarized in Table VI: 

1. Intellectually gifted children excel children from the regular 
classes, in grip in the hand, and in wrist tapping, which exemplify 
performance not involving body weight. 

2. In the standing broad jump, where body weight must be raised, 
the gifted are equal to, but do not surpass, their ordinary school mates. 
In this performance, the gifted of the ages studied carry on the average 
about 7 pounds more of body weight than their experimental competi- 
tors carry. It is inferred that they overcome this handicap partly 
through their superior height, and partly through their superior neuro-muscular
energy, revealed in tapping and gripping.

3. In chinning, the gifted are inferior to their ordinary school mates.
It is inferred that this inferiority is due to their handicap of body weight;
and that their superiority of neuro-muscular energy is not alone sufficient
to overcome this (superior height being of no assistance in
chinning).


TABLE VI.—SUMMARY OF DATA, SHOWING MEANS, MEAN DEVIATIONS, AND
PROBABLE ERRORS OF THE DIFFERENCES BETWEEN MEANS, IN AGE,
WEIGHT AND MOTOR PERFORMANCES, FOR GIFTED CHILDREN
AND CONTROLS

                          Experimental group     Control group       PE of
                          Mean       MD          Mean       MD       difference
Age (months)             135.2      6.20        135.1      6.40        1.13
Grip (kilograms)          25.0      3.29         23.4      3.56         .48
Broad jump (inches)       58.66     5.72         58.84     7.44        1.11
Chinning (successes)        .98     1.12          1.67     1.72         .27
Weight (pounds)           88.10    13.86         81.24    11.15        2.18
Feet jumped × weight     426.0     57.53        399.0     70.33       11.68
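
Throughout the report a difference between means is judged by comparing it with its probable error, a difference of about three times the probable error being treated as the threshold of reliability. The following modern sketch in Python applies that comparison to the figures of Table VI; it is an editorial illustration, the names are hypothetical, and the threshold of three is taken from the wording of the text rather than from any stated rule.

```python
# (gifted mean, control mean, probable error of the difference), from Table VI.
table_vi = {
    "grip (kilograms)": (25.0, 23.4, 0.48),
    "broad jump (inches)": (58.66, 58.84, 1.11),
    "chinning (successes)": (0.98, 1.67, 0.27),
    "weight (pounds)": (88.10, 81.24, 2.18),
    "feet jumped x weight": (426.0, 399.0, 11.68),
}


def ratio_to_pe(gifted_mean, control_mean, pe_difference):
    """Difference between means expressed in units of its probable error.
    Positive values favor the gifted group."""
    return (gifted_mean - control_mean) / pe_difference


for measure, (gifted_mean, control_mean, pe) in table_vi.items():
    ratio = ratio_to_pe(gifted_mean, control_mean, pe)
    verdict = "at least 3 PE" if abs(ratio) >= 3 else "under 3 PE"
    print(f"{measure:22s} {ratio:+.2f}  ({verdict})")
```

The ratio for grip is a little over three and that for feet jumped × weight is between two and three, which matches the report's description of each; the ratio for the broad jump is close to zero.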
VARIABILITY IN AMOUNT OF DIFFERENT TRAITS
POSSESSED BY THE INDIVIDUAL 


CLARK L. HULL 


University of Wisconsin 


A vast amount of attention has been given to the shape and range
of the distribution of individuals on the greatest variety of traits.
It is almost a matter of course for a work which deals with differential 
psychology even incidentally, to give a number of graphs of the distri- 
bution of individuals on particular traits. A favorite figure for this 
purpose is Terman’s well known distribution of intelligence of children 
as measured by the Stanford revision of the Binet-Simon tests. These 
figures will ordinarily be followed by comments on the magnitude and 
importance of the differences in talent thus displayed. Attention 
will be called to the varying closeness of approximation of the different 
distributions to the normal probability curve. Having followed this 
line of exposition through to its conclusion, the subject is left as com- 
pletely, or at least adequately, treated. 

It is the purpose of this article to show that the conventional 
account of differential psychology thus briefly outlined above, is 
neither complete nor adequate. On the contrary, it leaves out of 
account a whole division of the subject, one which is likely to prove of 
greater importance than the aspect which has received so much attention.
This is the variability within the individual himself. The
commonly treated phase of differential psychology concerns the
differences among individuals in a single trait. The trait is constant,
the individuals vary. This ignores completely the immensely important
matter of the differences in amount of various traits possessed by
each individual. From this latter point of view the individual is
constant, the traits vary.

The first and better known aspect of differential psychology has 
generally come to be known by the name of individual differences. 
This is eminently appropriate since it deals with the differences among 
individuals. Similarly we shall call this second and neglected phase 
of differential psychology trait differences, since it deals with the differences
among traits.
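
The two aspects can be contrasted on a small table of scores in which each row is an individual and each column a trait. The following modern sketch in Python uses the transmuted scores of Subjects A, B, and C on tests 1 to 5 as printed in Table II; the choice of the standard deviation as the measure of spread is an editorial assumption, not necessarily the statistic used in the article, and the variable names are hypothetical.

```python
from statistics import pstdev

# Transmuted scores from Table II, tests 1 to 5, for Subjects A, B and C.
scores = {
    "A": [61.8, 74.8, 75.2, 64.3, 69.8],
    "B": [90.1, 86.5, 82.9, 79.5, 80.1],
    "C": [92.0, 92.3, 84.8, 92.4, 86.9],
}

# Trait differences: the individual is constant, the traits vary.
# One spread per subject, taken across that subject's five test scores.
trait_spread = {subject: pstdev(row) for subject, row in scores.items()}

# Individual differences: the trait is constant, the individuals vary.
# One spread per test, taken across the three subjects.
individual_spread = [pstdev(column) for column in zip(*scores.values())]

for subject, spread in trait_spread.items():
    print(f"Subject {subject}: spread across traits = {spread:.2f}")
for test_number, spread in enumerate(individual_spread, start=1):
    print(f"Test {test_number}: spread across individuals = {spread:.2f}")
```

Three subjects and five tests are far too few to support any conclusion; the sketch only shows that the same table of scores yields two different families of variability depending on the direction in which it is read.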

The problems involved in the subject of trait differences are similar 
in some respects to those of individual differences. We wish to know 
for example whether the distribution of talents within the individual 
shows the familiar bell-shaped contour, with the various resulting
implications. A problem of greater significance is the extent or range
of these differences. It is obvious, for example, that if no differences
exist, i.e., if the various traits of an individual are all equal or approximately
equal in amount, he can do one thing about as well as another.
This would mean that special aptitudes do not exist and that time spent
in seeking them will be wasted. It would, for example, preclude the
possibility of a scientific vocational guidance. Such, in fact, seems
substantially to be the view recently put forward by Kitson.¹ If,
however, it should turn out that the various talents of the individual 
are highly diversified, possibly to a degree approximating the varia- 
bility of groups on single traits, a presumption would be established 
in favor of significant specialization of vocational aptitudes. A third 
problem is whether people are all equally variable in the amounts of 
the various talents possessed by them. Or shall we find that the 
talents of one person will cluster closely together around one partic- 
ular point on the scale while those of another may be scattered very 
widely over all parts of the scale? 

The data which furnish the basis of the present investigation were 
placed at the writer’s disposal by Mr. Charles E. Limp. They consist 
of 35 complete series of test scores on 107 first year high school
students.² Of these test scores 25 were obtained from various group
“intelligence” tests and 10 were based on June Downey’s group will-temperament
tests. All of the intelligence tests were scored in such
a way that a large score on a test represented a “good” score. What 
constitutes a ‘‘good”’ score on the Downey tests is not so obvious, but 
they seem actually to be scored on this basis. This is indicated by 
the fact that with high school algebra as a criterion they either cor- 
relate positively (six cases) or show small and non-significant negative 
correlations (four cases).* The list of tests employed, together with 
a certain amount of explanatory material, is given in Table I. Test 
scores from three subjects chosen for purposes of illustration are shown 
in columns 2, 3, and 4 of Table II.

The procedure of the present investigation was first to convert or 
transmute each of the 35 sets of test scores into new series each one of 





1 “The Psychology of Vocational Adjustment.” Pp. 226-230.

2 It is a pleasure to acknowledge this indebtedness to the generosity of Mr. 
Limp. The data had been obtained by him with the greatest care in connection 
with an investigation of shorthand and typewriting aptitudes. 

* See this Journal, February, 1925, p. 76.
TABLE I.—SHOWING THE VARIOUS TESTS EMPLOYED IN THE PRESENT
INVESTIGATION

                                                                  IDENTIFICATION
NAME OF TEST                                                          NUMBER
[illegible] (Terman No. 1)..............................................  1
[illegible] (Terman No. 2)..............................................  2
[illegible] (Terman No. 3)..............................................  3
Logical selection (Terman No. 4)........................................  4
[illegible] (Terman No. 5)..............................................  5
Sentence meaning (Terman No. 6).........................................  6
[illegible] (Terman No. 7)..............................................  7
Mixed sentences (Terman No. 8)..........................................  8
Classification (Terman No. 9)...........................................  9
Number series (Terman No. 10)........................................... 10
Motor reaction—dotting squares (Hoke No. 1)............................. 11
[illegible] (Hoke No. 2)................................................ 12
Quality of writing (Hoke No. 3)......................................... 13
Speed of reading (a kind of completion test) (Hoke No. 4)............... 14
[illegible] (Hoke No. 5)................................................ 15
Spelling—choice between correctly and incorrectly spelled words
    (Hoke No. 6)........................................................ 16
Symbols—a digit-symbol substitution test (Hoke No. 7)................... 17
Speed of decision—self estimates of character traits (Downey)........... 18
Coordination—ability to write down long words on short lines
    (Downey)............................................................ 19
Freedom from load—difference between speeded writing and
    ordinary writing rate (Downey)...................................... 20
Motor inhibition—test of how slowly pencil can be moved while
    [illegible] (Downey)................................................ 21
Volitional perseveration—ability to disguise handwriting
    (Downey)............................................................ 22
Interest in detail—ability to imitate handwriting (Downey).............. 23
Motor impulsion (Downey)................................................ 24
[illegible] (Downey).................................................... 25
[illegible] (Downey).................................................... 26
Finality of judgment (Downey)........................................... 27
Easy directions (Woodworth-Wells)....................................... 28
[illegible]............................................................. 29
[illegible]............................................................. 30
[illegible]............................................................. 31
[illegible]............................................................. 32
Coordination of reaction (dotting squares, Henmon)...................... 33
Courtis addition—fundamentals........................................... 34
Courtis multiplication—fundamentals..................................... 35


TABLE II.—SHOWING TYPICAL TEST SCORES, CORRESPONDING TRANSMUTED
SCORES, AND CORRELATION BETWEEN TEST PERFORMANCE AND
INDIVIDUAL TALENT VARIABILITY

Number       Test scores of         Corresponding         Correlation
of test      typical subjects       transmuted scores     with trait
(Table I)                                                 variability
               A      B      C        A      B      C        (r)
    1          1     16     17      61.8   90.1   92.0      -.06
    2         10     18     22      74.8   86.5   92.3      -.10
    3          2     10     12      75.2   82.9   84.8      +.17
    4          0      7     13      64.3   79.5   92.4      +.04
    5          0      6     10      69.8   80.1   86.9      +.12
    6          8      6     10      79.5   76.6   82.4      -.04
    7          4      8     12      66.9   74.9   82.9      +.14
    8          5      4     16      80.3   78.5   99.8      +.16
    9          9     15     15      69.7   88.8   88.8      +.18
   10          2     10     18      70.3   79.8   89.3      +.07
   11         56     60     65      77.9   80.5   83.8      -.08
   12         28     55     61      59.9   81.8   86.7      -.27
   13         40     56     50      71.9   83.8   79.4      -.07
   14         30     44     66      74.4   82.8   95.0      +.10
   15         28     62     72      70.2   84.5   88.7      -.11
   16         37     40     60      78.7   80.2   90.1      -.22
   17         14     45     62      53.7   75.2   87.0      +.04
   18          2      5      3      69.1   81.2   73.2      -.15
   19          2      3      1      77.7   80.9   74.4      +.23
   20         13      9     13      85.7   78.3   85.7      -.07
   21          1      3      4      76.0   81.8   84.7      +.19
   22         13      8      5      92.2   80.8   73.9      +.11
   23          4      6     10      74.5   80.4   92.3      -.03
   24          2      2      2      86.4   86.4   86.4      -.03
   25         10      9     10      88.1   86.3   88.1      -.01
   26          9      9      3      87.2   87.2   69.9      -.18
   27          3      9      5      74.1   89.1   79.1      -.02
   28         12     15     14      80.8   86.2   84.4      -.01
   29         37     46     62      67.2   73.5   84.8      +.15
   30         48     76     90      63.4   81.0   89.8      -.17
   31         13     19     25      76.4   82.4   88.4      +.01
   32         91    133    135      66.7   83.0   83.8      -.06
   33         64     53     60      78.5   74.2   77.0      +.42
   34         10     10     10      89.5   89.5   89.5      +.05
   35          9      7      8      85.6   81.2   83.4      -.08
 Mean                               74.8   82.0   85.4
 S.D.                               8.87   4.30   6.39
which should be strictly comparable with all the others. To do this 
the means and standard deviations of each of the original 35 series of 
scores were computed. With these values available it was possible 
to transmute the original series into new and equivalent ones all of 
which would have identical means and identical standard deviations. 
The transmutation was accomplished by the formula:¹


Since the statistical proportions of such equivalent series are arbitrary 
and mere matters of convenience, the mean chosen for the new series 
was 81 and the S.D., 7. This combination of values yields series closely 
resembling ordinary school marks. In order to be certain that no 
error had taken place in the 3745 transmutations thus made, the means 
and the S.D.’s of all the resulting 35 equivalent series were computed 
and found correct, i.e., all 35 means were found to be 81 and all 35 
S.D.’s were found to be 7. 
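
The conversion just described can be sketched in a few lines. The version below assumes the ordinary linear form, new score = 81 + 7 × (raw score − group mean) / group S.D., which is the only increasing linear conversion that yields the stated mean and S.D.; the name `transmute` and the sample scores are hypothetical and are not taken from the study.

```python
from statistics import fmean, pstdev


def transmute(scores, new_mean=81.0, new_sd=7.0):
    """Linearly rescale one test's scores to a chosen mean and S.D.

    Assumed form: x' = new_mean + new_sd * (x - mean) / sd, where mean
    and sd are taken over the whole group (S.D. with divisor N).
    """
    mean = fmean(scores)
    sd = pstdev(scores)
    if sd == 0:
        raise ValueError("all scores are identical; the S.D. is zero")
    return [new_mean + new_sd * (x - mean) / sd for x in scores]


# Hypothetical raw scores of a small group on a single test.
raw = [1, 16, 17, 9, 12, 20, 5]
converted = transmute(raw)

# The check described in the text: the new series has mean 81 and S.D. 7.
print(round(fmean(converted), 2), round(pstdev(converted), 2))
```

Applied to each of the 35 tests for 107 subjects, such a routine would be called on 35 × 107 = 3745 scores, the number of transmutations mentioned above.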

There was now available for each subject comparable quantitative 
indices of his relative strength in each of 35 traits. Typical sets of 
such comparable trait scores for the three subjects mentioned above 
are shown in columns 5, 6 and 7 of Table II. It was accordingly 
possible at once to gather the transmuted equivalent scores of single 
individuals into the form of histograms, the shape of which would indi- 
cate in a general way the answer to our first question. A typical dis- 
tribution (subject B, Table II) is shown in Figure 1. As is to be 
expected where no more than 35 items are represented in a distribution, 
there is considerable irregularity in its contour. But even so there is 
obviously a distinct tendency to approach the characteristic shape 
of the normal probability curve. This applies in general to the trait 
distribution of all of the many other individuals whose traits were 
plotted. In order to get a clearer picture the scores of certain subjects 
having approximately the same mean of talent and degree of talent 
variability, were pooled. Three such pools were formed. These 
groups represent respectively individuals of average trait variability, 
individuals of comparatively slight trait variability, and individuals of 
extremely wide trait variability. The resulting distributions are 
shown in Figs. 2, 3, and 4. The contours of these distributions are 
naturally much smoother than is that of Fig. 1. With this increase in 
smoothness there is apparent a still more marked approximation to 
the shape of the normal curve. This tendency seems about equally 
marked in all of the three regions of trait variability represented by 
the graphs. The indication seems to be pretty clear that the distribution 
of talent within an individual follows the normal law in exactly 
the same sense that distributions of individual differences do, and 
presumably with the same implications. 

1 Hull, Clark L.: The Conversion of Test Scores into Series Which Shall Have 
Any Assigned Mean and Degree of Dispersion. Journal of Applied Psychology, 
Vol. VI, pp. 298-300. 

We pass next to the problem of the extent or range of trait differences 
as compared with the magnitude of individual difference. It 
will be recalled that by the system of transmutation employed above, 
the S.D. of our individual differences for all traits alike is 7. This 
represents the S.D. of the 107 individuals’ scores on any given trait 
and is our basis of comparison. With the transmuted equivalent 
scores available it is a simple matter to determine the extent of trait 
variability. This is accomplished merely by computing the S.D. of the 
35 transmuted trait scores of each individual. If all 35 traits are highly 
and positively correlated there will be little trait variability and the 
trait S.D. will approach zero. If, on the other hand, the 35 traits are 
relatively uncorrelated, there will be much variability and the S.D. 
will, on the average for all individuals, approach 7. Accordingly the 
S.D. of the trait scores for each of the 107 subjects was computed.¹ 
As might easily have been anticipated from the examination of Figs. 
1 to 4, the magnitude of trait variability is very considerable. While 
it differs from one individual to another, the average trait variability 
for all subjects taken together is 6.33 points as compared with a group 
variability of 7. A distribution of the individual differences in trait 
variability is shown in Fig. 5. 
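
As a check on the procedure, the trait S.D. of a single subject can be recomputed from the transmuted scores printed in Table II. The sketch below uses the 35 scores listed there for Subject B. It assumes that the S.D. is taken over all 35 scores with divisor N rather than N − 1, an assumption which reproduces the tabled mean and S.D. to within rounding; the variable names are illustrative only.

```python
from statistics import fmean, pstdev

# Transmuted scores of Subject B on tests 1 to 35, as listed in Table II.
subject_b = [
    90.1, 86.5, 82.9, 79.5, 80.1, 76.6, 74.9, 78.5, 88.8, 79.8,
    80.5, 81.8, 83.8, 82.8, 84.5, 80.2, 75.2, 81.2, 80.9, 78.3,
    81.8, 80.8, 80.4, 86.4, 86.3, 87.2, 89.1, 86.2, 73.5, 81.0,
    82.4, 83.0, 74.2, 89.5, 81.2,
]

GROUP_SD = 7.0  # S.D. of the 107 subjects on every transmuted test

trait_mean = fmean(subject_b)  # the subject's average ability
trait_sd = pstdev(subject_b)   # the subject's trait variability

print(f"mean = {trait_mean:.1f}, trait S.D. = {trait_sd:.2f}")
print(f"trait S.D. / group S.D. = {trait_sd / GROUP_SD:.2f}")
```

Repeating the same computation for each of the 107 subjects and averaging the resulting S.D.'s would give the figure that the text compares with the group S.D. of 7.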

If the 35 selections of behavior employed in the present investigation 
had all been perfect measures of the respective traits sampled and 
the traits themselves were a fair sampling of all the possible forms of 
human behavior, we could say at once that the range of talent within 
the individual is 90 per cent (6.33/7) as great as that within a moderately 
selected group. As a matter of fact we know that neither condition 
is satisfied exactly though both are approximated. Our conclusions 
must therefore be modified accordingly. It is well known that 





1 The extensive computations involved throughout the present investigation 
were carried out by means of an automatic correlation machine designed by the 
writer. This machine is described in the December, 1925, issue of the Journal 
of the American Statistical Association. 

[Figs. 1 to 4: histograms. Horizontal axis: scale of transmuted test scores. Vertical axis: number of test scores.]

Fig. 1.—Trait distribution of a single subject (B, Table II) based on thirty-five 
traits. 

Fig. 2.—Composite trait distribution of six subjects all having approximately the 
same mean and the same standard deviation, the latter being in the middle range. 

Fig. 3.—Composite trait distribution of six subjects all having approximately the 
same means and the same standard deviations. The dispersions in this case are all 
extremely narrow. 

Fig. 4.—Composite trait distribution of six subjects having approximately the same 
mean and the same standard deviation. The dispersions in this case are all extremely 
wide. 

a fallible measure of a function yields a larger S.D. than a perfect 
measure.¹ This fact by itself would tend to reduce both S.D.’s from 
their present size, presumably in about equal proportions, thus leaving 
them relatively about as before. But in addition the fallibility of the 
test scores would tend to reduce the correlation among the tests 
which in turn would make the trait S.D.’s too large as compared with 
the group S.D.’s. This means that with perfect test scores the true trait 
S.D.’s would be smaller than found above as compared with the true 
group S.D.’s. A rough calculation based upon estimated correlations 
and reliability coefficients, indicates that this factor would hardly 
more than double the difference between the S.D.’s found above, i.e., 
reduce the average trait S.D. from 6.33 to 5.67. This would still leave 
the range of trait difference over 80 per cent as great (5.67/7) as that 
within a normal group. 
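
The arithmetic behind the two percentages can be laid out as follows. This is only a sketch of the ratios quoted in the text; the variable names are illustrative, and the correction is taken here as an exact doubling of the difference between the two S.D.'s, which gives 5.66, within a hundredth of the 5.67 quoted above.

```python
GROUP_SD = 7.0            # S.D. assigned to every transmuted test
OBSERVED_TRAIT_SD = 6.33  # average trait S.D. found for the 107 subjects

# Observed range of talent within the individual, relative to the group.
observed_ratio = OBSERVED_TRAIT_SD / GROUP_SD

# Rough correction for fallible tests: double the gap between the S.D.'s.
gap = GROUP_SD - OBSERVED_TRAIT_SD
corrected_trait_sd = GROUP_SD - 2 * gap
corrected_ratio = corrected_trait_sd / GROUP_SD

print(f"observed:  {observed_ratio:.0%}")
print(f"corrected: {corrected_trait_sd:.2f} -> {corrected_ratio:.0%}")
```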

If this or anything closely approximating this should turn out to be 
true of vocational aptitudes, its significance as to the unrealized possi- 
bilities of vocational guidance will be profound. It means that if 
vocational choices are made with little or no knowledge of special 
abilities and disabilities, as is evidently now the case, it will be a rare 
chance, indeed, where an individual will choose the one vocation in 
which his aptitude is greatest. Indeed, by mere chance he would 
be just as likely to choose his worst. This means that the difference 
between the best and the worst choice of the average individual may 
be as much as 75 or 80 per cent as great as that between the best and 
the worst of a first year high school group on a great variety of traits, 
many of which are non-academic in nature. Such a mistake would 
be not only a personal tragedy but also a social tragedy of the first order. 
If we assume that in the chance choices of vocations the choices will 
be as likely to fall in the worse as the better half of potentialities, the 
general average personal vocational efficiency will be about midway 
between the average person’s best and his worst. What would it 
mean in terms of increased social efficiency and personal happiness if a 
system of vocational prognosis could be established which would raise 
the average quality of vocational choice from around the 50 per cent 
level where it probably now stands to the upper 5 per cent of individual 
potentialities? Or suppose it could be raised on the average from the 
50 per cent level only to the upper 20 per cent throughout an entire 
population. Even this would probably represent an advance comparable 
to anything which has taken place in modern times. 

1 Kelley: “Statistical Method.” P. 213. 

The third question which we set out to investigate is the extent to 
which individuals differ in the magnitude of their talent variability. 
Figures 2 to 5 already have shown the general nature of the answer. 
The results reveal a very marked difference among individuals as to 
their trait variability. The smallest variability yields a S.D. of 4.3 
whereas the greatest variability runs up to 9.09, an increase of 111 
per cent. The detailed distribution of this curious trait (if it be a 
trait) is shown in Fig. 5. This wide difference in trait variability 
would seem to indicate that the possibilities of vocational guid- 
ance may be about twice as great for certain individuals as for 
others. 
Fig. 5.—Distribution of the trait standard deviations of 107 individuals. (Horizontal axis: standard deviation of transmuted scores.) 


The extent of trait variability, as itself a kind of higher-order 
trait, accordingly becomes of interest. Possibilities that varia- 
bility in test response may be of significance in certain kinds of 
aptitudes at once suggest themselves. It was with a view to making 
a preliminary exploration of the situation that the relation of 
talent variability to average ability was investigated. It was thought 
that possibly variability was partly a function of average ability. 
When the two were correlated, however, it was found that there 
was no tendency whatever for the two to vary together, the coefficient 
being +.03. 

It was thought possible also that some one or more of the 35 tests 
might be at least in part a function of this interesting phase of personal 
behavior. The trait standard deviations were accordingly correlated 
with each of the 35 tests in the hope of encountering some evidence. 
These coefficients are shown in column 8 of Table II. Of the 35 tests 
thus investigated, only one is unmistakably related to the tendency 
to individual variability. This test (No. 33) requires the subject to 
dot with a pencil consecutive squares of specially ruled paper, and 
was designed by Professor V. A. C. Henmon. 
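
The two explorations described above, relating trait variability first to average ability and then to each separate test, can be sketched as follows. The product-moment formula is assumed, since the text does not name the coefficient used. In the small score table, the first three rows are the scores of Subjects A, B and C on tests 1 to 4 from Table II, and the remaining two rows are invented so that a correlation can be computed; the function name `pearson_r` is likewise illustrative.

```python
from math import sqrt
from statistics import fmean, pstdev


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length series."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Transmuted scores: one row per subject, one column per test.
scores = [
    [61.8, 74.8, 75.2, 64.3],  # Subject A, tests 1-4 (Table II)
    [90.1, 86.5, 82.9, 79.5],  # Subject B, tests 1-4 (Table II)
    [92.0, 92.3, 84.8, 92.4],  # Subject C, tests 1-4 (Table II)
    [80.0, 71.5, 88.0, 83.5],  # invented
    [77.0, 79.0, 76.5, 90.0],  # invented
]

trait_sd = [pstdev(row) for row in scores]    # variability of each subject
avg_ability = [fmean(row) for row in scores]  # average ability of each subject

# Relation of talent variability to average ability.
r_sd_ability = pearson_r(trait_sd, avg_ability)

# Relation of talent variability to each separate test.
r_sd_by_test = [
    pearson_r(trait_sd, [row[j] for row in scores])
    for j in range(len(scores[0]))
]

print(round(r_sd_ability, 2), [round(r, 2) for r in r_sd_by_test])
```

With the full data the first coefficient would correspond to the +.03 reported above, and the list to column 8 of Table II; the values printed from this toy table have no such meaning.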
AN EXPERIMENT IN JUDGING INTELLIGENCE BY 
THE VOICE¹ 

WILLIAM MICHAEL 
AND 
C. C. CRAWFORD 
University of Idaho 


The hypothesis that intelligence can be measured by observing the 
inflection of the voice was suggested by a number of widely different 
observations. For example, it was observed that in teaching classes 
in public speaking it was possible to change or improve breathing habits, 
voice quality, and the control of force, but extremely difficult to change 
inflection habits. Again, persons have been observed who possessed 
very limited vocabularies, but who were able to express very complex 
meanings by the inflection of the few words they used. For example, 
the exclamation “Oh” may be so inflected as to mean “You will, will 
you,” or “Well, I never thought of that.” This use of inflection in 
conveying ideas explains why it is sometimes possible to get the essen- 
tial meaning of conversation in a foreign tongue even without being 
able to understand any of the words that are spoken. In fact, it is 
entirely possible that, in the evolution of language in the race, differ- 
ences in pitch variation were used to convey ideas before differences in 
quality or resonance or vocabulary were evolved. If so, then inflection 
must be considered as more a part of the original or native behavior of 
the individual than vocabulary. The idea behind the intelligence test 
used in this experiment is, therefore, to go back to this fundamental 
form of behavior rather than to work with the more completely 
acquired behavior involved in tests which depend upon vocabulary. 

A final consideration which seems to justify measuring intelligence 
by the voice is that behavioristic psychology has so thoroughly identified 
thought and the more intelligent types of behavior with the 
language mechanisms. Yet the usual type of intelligence test has 
measured these language responses through reading and writing, which 
are wholly acquired forms of behavior, instead of through spoken 
language, which is much more nearly native, or pure inflection, which 
is even more native and fundamental than spoken vocabulary. 

1 The investigation reported in this article is the result of the cooperation 
between a specialist in public speaking and one in education. The specialist in 
public speaking is responsible for the origination of the problem and for the actual 
judging of the subjects, while the specialist in education is responsible for the 
methodology and experimental technique, and for the formulation and evaluation 
of the results. 

Let us define the term inflection as change in pitch within a syllable, 
word, phrase, sentence, or paragraph in harmony with the idea which 
the given speech unit is intended to convey. The hypothesis which 
this experiment was designed to test was that the intelligent person 
would do two things: (1) Vary the inflection of his voice over a wide 
range, and (2) control his inflection so that the rises and falls would 
correspond to the true meaning of the passage spoken; whereas the 
unintelligent person would speak in a monotone or raise or lower his voice 
at the wrong time and without due regard to the sense of the passage. 

Five types of persons have been observed in connection with public 
speaking work as regards their inflection. The first is the monotone, 
who seems to have no appreciation of tone change as related to meaning. 
The second has more appreciation of the value of tone changes on the 
lesser speech units, but lacks the power to discriminate; and his pat- 
terns tend to involve too regular and too recurrent rhythms. The 
third type has considerable sense of the necessity of tone change, but 
tends to overdo it by making the intervals greater than required. The 
fourth type has the sense of the need of tone changes possessed by the 
third and, in addition, has a sense of economy. He does not overdo 
tone changes and his inflections are more subtle. He falls down, 
however, on the larger speech units and in the suggestion of sustained 
moods. His shortcoming will not show when he is reading material 
which involves purely denotative language, but will show in the case of 
connotative language, such as poetry. The fifth and highest type has 
all of the virtues and none of the faults of the other types. He 
modulates accurately and appropriately on the lesser speech units, and 
is also able to compose the more complex tonal patterns necessary to 
the suggestion of the sustained mood or complex idea. 

To summarize, the inflectional patterns of different persons com- 
prise two variables: (1) Sensitivity to the need of tone change, and (2) 
economy, or freedom from random movement in making the changes. 
The five types outlined above represent the principal combinations 
met with, and have been made the basis of the ratings reported in this 
study. The boundaries between the types are not clearly and sharply 
defined, but represent real distinctions nevertheless. 

In rating the students as to inflection for the purpose of determining 
correlations between inflection and intelligence, it was thought desirable 
to rate a number of other qualities of speech at the same time, in 
order to utilize the findings as by-products of the main investigation. 
Each of eight other voice factors was, therefore, rated on a five-point 
scale in the same way as was inflection. The nine qualities rated were as 
follows: (1) Inflection, as described above. (2) Normal quality, or the 
degree to which the person had equalized or balanced resonance in the 
production of a normal tone. (3) Pitch accuracy, or the ability to 
reflect the purely denotative meaning of the lesser speech units in 
pitch changes. (4) Key sense, or the ability to adjust pitch changes to 
the larger units of speech so as to maintain a sustained suggestion of 
mood (both pitch accuracy and key sense are included in the term 
inflection). (5) Force sense, or the ability to control breathing and the 
expulsion of breath effectively. (6) Enunciation, or the clearness and 
distinctness with which the vowels and consonants are produced. 
(7) Rate and phrasing, or the ability to represent meaning by proper 
speech punctuation. (8) Accompanying physical activity, or the degree 
to which the total bodily activities are coordinated with the purely 
vocal activities, or, in other words, the degree to which random movement 
is eliminated. (9) Use of language, or the ability to recognize and 
pronounce words. 

Ratings of the above list of voice qualities were made by having 
students read orally three selections in the presence of the judge, who 
was a public speaking teacher. The passages read were Hamlet’s 
Advice to the Players, a selection from R. L. Stevenson’s Apology for 
Idlers, and one from R. G. Ingersoll’s oration at his brother’s tomb. 
The three passages combined included about 500 words. These 
passages were chosen because they call for variety in inflection and are 
difficult to read. They give ample opportunity to display one’s vocal 
capacities. The only directions given were to go to the front of the 
room and read the passages as well as possible, and without stopping 
to ask the pronunciation of words or any other questions. 

Fifty-six students were rated. They were selected from the 
classes taught by the writers of this report, with only two restrictions: 
only second-year students were chosen, and only those whose records 
for intelligence tests were on file. The previous scholastic standings 
and intelligence scores of the students chosen were unknown to the 
judge at the time the ratings were made. Of the 56 students rated, 
27 were boys and 29 girls. The judge was acquainted with 30 out of 
the 56, while 26 were complete strangers to him. The intelligence 
test scores which were available were from the Thurstone Test, published 
by the National Research Council, and used for the year 1924-25. 
The score used was the Median Percentile Rank, the only general 
average available for the test as a whole. 


TABLE I.—INTERCORRELATIONS OF THE NINE VOICE FACTORS FOR 56 STUDENTS 
AND FOR 26 UNKNOWN STUDENTS (FIRST LINE OF EACH PAIR: ALL 56 STUDENTS; 
SECOND LINE: THE 26 UNKNOWNS) 

                                        (1)   (2)   (3)   (4)   (5)   (6)   (7)   (8)   (9)
(1) Inflection                          ...   .34   .67   .14   .57   .36   .40   .27   .52
                                        ...   .22   .86   .36   .68   .43   .80   .28   .47
(2) Normal quality                      .34   ...   .33   .28   .21   .49   .30   .41   .48
                                        .22   ...   .30   .84  -.09   .82   .19   .05   .85
(3) Pitch accuracy                      .67   .33   ...   .33   .36   .44   .41   .34   .60
                                        .86   .30   ...   .11   .47   .36   .84   .83   .62
(4) Key sense                           .14   .28   .33   ...   .25   .16   .21   .50   .17
                                        .36   .84   .11   ...   .21   .13   .18   .34   .80
(5) Force sense                         .57   .21   .36   .25   ...   .26   .28   .36   .30
                                        .68  -.09   .47   .21   ...   .31   .26   .36   .41
(6) Enunciation                         .36   .49   .44   .16   .26   ...   .31   .29   .58
                                        .43   .82   .36   .13   .31   ...   .02   .01   .82
(7) Rate and phrasing                   .40   .30   .41   .21   .28   .31   ...   .50   .86
                                        .80   .19   .84   .18   .26   .02   ...   .48   .28
(8) Accompanying physical activity      .27   .41   .34   .50   .36   .29   .50   ...   .47
                                        .28   .05   .83   .34   .36   .01   .48   ...   .82
(9) Use of language                     .52   .48   .60   .17   .30   .58   .86   .47   ...
                                        .47   .85   .62   .80   .41   .82   .28   .82   ...

The results of the ratings may be presented under three divisions, 
as follows: (1) Intercorrelations between the nine voice factors. (2) 
Correlations of the nine voice factors with scholarship and intelligence. 
(3) Multiple correlations in which certain factors are combined and 
correlated with certain other factors. These will be discussed in turn. 
Since the results might be expected to differ somewhat in the case of 
students who were known to the judge, as contrasted with students 
who were complete strangers, separate calculations have been made 
for the unknown students alone. This obviates the possibility that 
positive correlations found to exist are due to judging on the basis 
of previous knowledge about the subjects instead of by inflection and 
vocal factors alone. 


TABLE II.—CORRELATIONS OF THE NINE VOICE FACTORS WITH SCHOLARSHIP 
AND INTELLIGENCE (FIRST LINE OF EACH PAIR: SCHOLARSHIP; SECOND LINE: 
INTELLIGENCE) 

Columns: (a) all 56 students; (b) 27 boys; (c) 29 girls; (d) 30 known 
students; (e) 26 unknown students; (f) 17 known boys; (g) 13 known girls; 
(h) 10 unknown boys; (i) 16 unknown girls. 

Nine voice factors                      (a)   (b)   (c)   (d)   (e)   (f)   (g)   (h)   (i)
Inflection                              .47   .57   .38   .40   .57   .60  -.10   .46   .66
                                        .34   .61   .17   .41   .53   .61  -.22   .68   .60
Normal quality                          .21   .35   .00   .15   .38   .24  -.25   .47   .31
                                        .02   [?]  -.16  -.08   .18   .14  -.28   .62   .67
Pitch accuracy                          .34   .44   .18   .28   .52   .38  -.04   .51   .44
                                        .01   .27  -.25  -.17   .23   .18  -.68   .60   .21
Key sense                               .24   .31   .34   .20   .37   .26  -.15   .47   .36
                                       -.09   .06  -.18  -.30   .10  -.12  -.46   .06  -.01
Force sense                             .33   .45   .15   .30   .40   .45  -.22   .42   .40
                                        .16   .36   .47   .17   .48  -.01  -.34   .68   .82
Enunciation                             .08   .17   .19  -.02   .11   .16  -.47   .22   .04
                                        .15   .29   .08   .02   .84   .19  -.09   .70   [?]
Rate and phrasing                       .27   .45   .02   .30   .23   .50  -.03   .31   .20
                                       -.02   .20  -.20  -.06  -.02   .10  -.18   .40  -.84
Accompanying physical activity          .25   .29   .06   .22   .17   .30  -.26   .26   .34
                                       -.08   .18  -.22   .26   .06  -.06  -.42   .86  -.04
Use of language                         .34   .59   .09   .25   .46   .44  -.18   .39   .41
                                        .25   .57   .08   .02   .46   .41  -.29   .69   .87

Table I presents the intercorrelations between the nine voice 
factors for the whole group and also for the students who were unknown 
to the judge at the time of the ratings. Two things are to be noted 
especially. First, the factors are positively correlated, there being 
only one negative coefficient in the table and it unimportant. Second, 
the factors which correlate most highly with inflection are pitch accuracy, 
force sense, and use of language. If the reader will keep this 
in mind he will note later on that these are the factors which have the 
highest correlation with scholarship. 

Table II gives the correlations of the nine voice factors with scholar- 
ship and intelligence respectively. Examination of the correlations 
with scholarship shows uniformly positive correlations except for the 
13 known girls, in whose case the coefficients are uniformly negative. 

This peculiarity is due in part to the difficulty of judging familiar 
voices, but since it did not occur in the case of known boys we must 
seek for some other additional cause. The best we can do is to say 
that it is due to chance or to special causes unknown—probably the 
former, since the number of cases is so very small. 

The correlations with intelligence are distinctly lower than with 
scholarship, and in many cases negative. This is due to the fact that, 
with the exception of inflection, the voice factors are largely acquisi- 
tions of the same general nature as scholarship, and are more like it 
than like intelligence. The situation shown in the table may be 
summarized by saying that inflection consistently yields positive 
correlations of substantial size with both scholarship and intelligence 
for all boys and for unknown girls, but that the other voice factors fail 
to correlate with either scholarship or intelligence very highly. 

Let us now turn to the results of the use of multiple correlations in 
the study. We have found three variables with substantial inter- 
correlations; namely, scholarship, intelligence, and inflection. Our 
problem is, then, to predict one of these from a combination of the 
other two, or vice versa. By means of multiple correlation we can 
determine what correlation would result if we were to combine the 
scores in two variables in the optimum manner and correlate the com- 
bined score with the third variable. Results of such computations 
are shown in Table III both for the entire group and for the unknowns. 
The result is that in only two of the six multiple correlations is there 
any appreciable increase because of the combinations. One of these 
is when scholarship is combined with intelligence and correlated with 
inflection in the case of unknown students, causing a rise from .57 to 
.64. The other is when intelligence and inflection are combined and 
correlated with scholarship for the whole group, causing a rise from 
.51 to .60. Neither of these increases is great enough to be statistically 
reliable, and it appears, therefore, that there is not much advantage 
in making the combinations. 
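
For readers who wish to verify such figures, the multiple correlation of one variable with the best-weighted combination of two others can be computed from the three simple correlations alone. The sketch below uses the standard two-predictor formula and the whole-group coefficients of Table III; the function name `multiple_r` is illustrative, and the article itself does not state the formula that was used.

```python
from math import sqrt


def multiple_r(r_c1, r_c2, r_12):
    """Multiple correlation of a criterion with two predictors combined.

    r_c1, r_c2: criterion with predictor 1 and with predictor 2
    r_12:       predictor 1 with predictor 2
    """
    return sqrt((r_c1**2 + r_c2**2 - 2 * r_c1 * r_c2 * r_12) / (1 - r_12**2))


# Simple correlations for the whole group of 56 students (Table III).
r_infl_schol = 0.47
r_infl_intel = 0.34
r_schol_intel = 0.51

# Scholarship predicted from intelligence and inflection combined.
r_schol = multiple_r(r_schol_intel, r_infl_schol, r_infl_intel)
print(f"{r_schol:.2f}")  # 0.60
```

The result agrees with the .60 given in the text. The other whole-group entries are reproduced only to within about .01, as is to be expected when the computation starts from two-place coefficients.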
TABLE III.—SIMPLE AND MULTIPLE CORRELATIONS INVOLVING THE THREE 
VARIABLES, INFLECTION, SCHOLARSHIP, AND INTELLIGENCE (FIRST LINE OF EACH 
PAIR: ALL 56 STUDENTS; SECOND LINE: THE 26 UNKNOWNS) 

                                            Inflection   Scholarship   Intelligence
Inflection                                     ...           .47           .34
                                               ...           .57           .53
Scholarship                                    .47           ...           .51
                                               .57           ...           .44
Intelligence                                   .34           .51           ...
                                               .53           .44           ...
Intelligence and scholarship combined          .49
                                               .64
Intelligence and inflection combined                         .60
                                                             .58
Scholarship and inflection combined                                        .53
                                                                           .56

Finally, we shall note the results of multiple correlations in which 
the several voice factors were combined with inflection and correlated 
with scholarship, in order to see whether the addition of any of these 
factors would increase the correlation. The results are emphatically 
negative. For the eight factors, in the order listed in the tables, the 
results, in the case of all 56 students, were .47, .49, .50, .47, .47, .48, 
.49, and .48 respectively, while inflection alone gave .47. For the 
26 unknown students, inflection alone gave .57, and when the other 
factors were each combined with inflection the coefficients were 
.61, .57, .60, .57, .59, .57, and .60 respectively. In other words, there 
was no significant increase as a result of the combinations. Direct 
inspection of the simple correlation coefficients reveals that the same 
thing would result in the case of predicting intelligence from inflection. 
The scores for inflection alone are therefore as good for predicting 
scholarship or intelligence as combinations of scores for inflection and 
the other voice factors. 


SUMMARY AND CONCLUSIONS 


It has been found that inflection, or the pattern of pitch changes in 
the voice, is a reasonably good measure of ability. 
The correlations between inflection and scholarship are approximately 
the same as those between intelligence tests and scholarship. 

Inflection correlates with scholarship slightly better than it does 
with intelligence tests. 

The correlation of inflection with intelligence and with scholarship 
is higher in the case of unknown students than in the case of students 
known to the judge. The difference is due in part to the difficulty 
of judging inflection in familiar voices. 

The various voice factors are positively correlated one with the 
other, and also positively correlated with scholarship; but inflection 
is the only one of the voice factors which has any important correlation 
with intelligence. 

The three factors, scholarship, intelligence, and inflection, are 
about equally intercorrelated, and any one of the three is about 
as safe a basis for predicting another as any two combined. 

There is little or no advantage to be gained by combining measures 
of other voice factors with inflection, since correlations with intelli- 
gence or scholarship are not raised appreciably as a result. 
PERIODICITY AND PLAY BEHAVIOR
HARVEY C. LEHMAN AND PAUL A. WITTY 
The School of Education, The University of Kansas 


Undue emphasis upon periodicity in play behavior has resulted in 
the more important characteristic of play behavior, namely, its con- 
tinuity, being obscured or under-estimated. Of periodicity and 
rhythm in play, as in all development, there can be no doubt. But 
any thoughtful attempt to characterize a particular period must bring 
the conviction that each stage merges into the succeeding one and that 
the obvious characteristic traits of each period have their beginnings 
in preceding stages and merge gradually into succeeding ones. 

Team play and social participation in play behavior have been 
emphasized as characteristic of certain periods of development. 
Individualistic play, too, has been designated as characteristic of
certain other periods of development. This paper attempts to ascer- 
tain salient data regarding periodicity in play behavior by: (1) Deter- 
mining the number of play activities engaged in by representative 
pupils of chronological ages 7 to 19 years inclusive; (2) obtaining for 
the same children indices of social adaptation. 


METHOD 


The Lehman Play Quiz, devised for children in Grades III—XII, 
was administered to over 6000 children in these grades in the public 
schools of Kansas City, Missouri. The children were asked to indicate,
among a comprehensive and catholic list of 200 play activities, only
those in which they had engaged during the preceding week. The
children were later asked to designate those activities in which they
had participated alone.

For each child the total number of play activities engaged in during
the preceding week was ascertained. The number participated in
with one or more additional children was next determined. The
percentage of the total activities that the social activities represented
was designated the index of social adaptation. Thus an index of social
adaptation of 80 indicates that 80 per cent of the activities engaged in
by the child were ones in which one or more other children also took
part.

1 Lehman, Harvey C.: I. The Play Activities of Persons of Different Ages. The
Pedagogical Seminary, Vol. XXXIII, June, 1926, pp. 250-72. II. Growth Stages
in Play Behavior. Op. cit., pp. 273-88.

Table I shows: (1) The number of children of various age levels
included in the study, (2) the mean indices of social adaptation for
the children in each age interval, and (3) the mean number of activities
engaged in at each age level.
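The index of social adaptation described in this section can be sketched in a few lines. This is only an illustration: the function name and the sample child are hypothetical, and it assumes that the number of social activities is obtained by subtracting the activities marked as done alone from the total checked.

```python
def index_of_social_adaptation(total_activities: int, solitary_activities: int) -> float:
    """Per cent of a child's reported activities that were shared with other children."""
    if total_activities <= 0:
        raise ValueError("the child must report at least one activity")
    if not 0 <= solitary_activities <= total_activities:
        raise ValueError("solitary activities must lie between 0 and the total")
    # Assumption: every activity not marked "alone" counts as a social activity.
    social_activities = total_activities - solitary_activities
    return 100.0 * social_activities / total_activities


# Hypothetical child: 40 activities checked, 8 of them marked as done alone.
example_index = index_of_social_adaptation(40, 8)
print(example_index)
```

An index of 80 for this hypothetical child corresponds to the reading given in the text: 80 per cent of the activities were shared with one or more other children.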


RESULTS 


Mean Number of Activities in Which Children Participate—Figure 
1 shows the mean number of play activities engaged in by the children 
of the various age levels. It was found that the younger children 
engaged in a larger number of activities than did the older ones. This 
may be due to the limitations of the quiz, the list having been made 
for younger children. On the other hand, because of increased 
demands upon their leisure time, older children may be denied oppor- 
tunity for versatility in play. 

Figure 1 shows also that the transition from age to age is very 
gradual. It brings out clearly the fact that there are no single age 
levels at which the diversity of interests suddenly decreases or increases 
by spurts. 


TABLE I.—PLAY DATA FOR 6886 CHILDREN

Chronological   Frequencies   Mean index of        Mean number of
age                           social adaptation    activities engaged in

 7½                  84            62.01                 44.26
 8½                 468            63.25                 40.56
 9½                 935            61.70                 42.37
10½                 981            60.58                 37.67
11½                 748            58.12                 36.86
12½                 903            55.69                 34.01
13½                 946            55.65                 31.52
14½                 848            52.92                 28.58
15½                 573            52.28                 27.45
16½                 288            50.56                 25.91
17½                  82            52.04                 24.93
18½                  25            52.32                 25.50
19½                   5            57.50                 25.50

Total              6886
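As a rough check on the claim that the transition from age to age is gradual, the figures of Table I can be differenced from one age level to the next. The sketch below simply re-enters the table; the variable names are illustrative, and the small groups at the oldest ages make the last few differences unreliable.

```python
# Rows of Table I: (chronological age, frequency, mean index of social
# adaptation, mean number of activities engaged in).
TABLE_I = [
    (7.5, 84, 62.01, 44.26),
    (8.5, 468, 63.25, 40.56),
    (9.5, 935, 61.70, 42.37),
    (10.5, 981, 60.58, 37.67),
    (11.5, 748, 58.12, 36.86),
    (12.5, 903, 55.69, 34.01),
    (13.5, 946, 55.65, 31.52),
    (14.5, 848, 52.92, 28.58),
    (15.5, 573, 52.28, 27.45),
    (16.5, 288, 50.56, 25.91),
    (17.5, 82, 52.04, 24.93),
    (18.5, 25, 52.32, 25.50),
    (19.5, 5, 57.50, 25.50),
]

# The frequencies should add up to the total printed under the table.
total_children = sum(row[1] for row in TABLE_I)

# Change in the mean number of activities from each age level to the next.
activity_changes = [
    round(later[3] - earlier[3], 2)
    for earlier, later in zip(TABLE_I, TABLE_I[1:])
]
largest_change = max(abs(change) for change in activity_changes)

print(total_children)
print(activity_changes)
print(largest_change)
```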

































Fig. 1.—Relationship between chronological age and number of play activities in which
children participate.

Fig. 2.—Relationship between index of social adaptation and chronological age.
(Vertical axis: index of social adaptation; horizontal axis: chronological age,
7½ to 19½ years.)












Indices of Social Adaptation.—Diagram 2 shows the mean index of 
social adaptation for groups of children of various age levels. Very 
large individual differences exist at every age level in this regard. As 
compared with the individual differences found at the various age 
levels, the mean differences between the various age levels are not 
especially significant. 

It is, of course, true that Diagram 2 presents only partial data 
regarding play behavior. There is probably a quality as well as a 
quantity in social participation. The real differences with respect 
to social adaptation may therefore be qualitative rather than
quantitative. In that case, the real age differences with respect to
social adaptation could only be ascertained by the method of a detailed
psychological analysis of how persons of various age levels participate
in their recreational activities. The subjective nature of such analyses
makes them difficult and of questionable validity. Too, the enormous
individual differences that exist among the members of a group of the 
same chronological age make doubtful the advisability of a program to 
discover such tendencies. 


CONCLUSIONS 


1. Attempts to differentiate certain chronological age periods in 
terms of differences displayed by children in diversity of play activities 
during these periods seem unjustifiable. 

2. The play trends which characterize a given age group seem to 
be the result of gradual changes occurring during the growth period. 
These changes are not sudden and characterized by periodicity but 
are gradual and contingent. 

3. Nor can any age or group of ages between 7½ and 19½ inclusive
be characterized by play behavior primarily social or primarily
individualistic. Diagram 2 shows clearly that such a practice is 
unwarranted. 














COACHING FOR INTELLIGENCE TESTS 


M. E. GILMORE 


Assistant Director, Canton Normal School, Canton, Ohio 


In this study an attempt was made to determine how much a 
student’s score in intelligence tests might be increased as a result of 
coaching. 

Sixty-four people taking the first year of a teacher’s training course 
were selected for the experiment, 32 for a control group and an equal 
number for the experimental or coaching group. These persons were 
chosen at random for these groups without any attempt to select 
them with reference to brightness or dullness or previous rating. As 
a result the groups were very evenly divided with reference to point 
score, as shown by the first test which all took at the same time. 

The test used as a basis for the experiment was the Otis group
intelligence scale (opposites, disarranged sentence, proverbs, analogies,
similarities). This test was chosen because it is representative
of intelligence tests, and because it is constructed so as to measure
advanced students more accurately. It was quite necessary that the
test reach, as far as possible, beyond the limits of these students.
Similar materials were chosen from other tests.

After the first test was administered to both groups at the same 
time the members of the experimental group were started on a sys- 
tematic coaching plan. Typewritten sheets were prepared for each 
student with material similar but not identical to that of the basic 
test. Two sheets of each of the five tests used were prepared, studied 
by the student, and finally checked for right and wrong answers. 
By this method a reasonably accurate check could be made as to 
the time and intensity of study each member of the group gave to the 
material. This coaching extended over a period of 12 weeks. The 
material was given out and checked at quite regular intervals. 

After this coaching was completed the basic test was again admin- 
istered in the regular way to both groups at the same time and the 
results tabulated. The table shows the results of the experiment for 
each individual in each group and the corresponding results of the 
groups as a whole. 

For convenience, the names of each group were arranged alphabetically
on the summary sheet and are numbered 1, 2, 3, 4, 5, 6, 7,
etc. on the chart. The point scores are paralleled for each student in
both tests for both groups as the table shows. By referring to the
table one can readily see the points made by each student in each group
for each test. The points increased or decreased as well as the per
cent of increase or decrease for each student of each group in each test
are likewise shown. Totals and averages are also tabulated. It can
readily be seen that the coached group increased 9.5 per cent over the
control group as a result of the coaching. It can also be noticed that
the control group increased a certain per cent as a result of repeating
the test.

TABULATION OF RESULTS (SUMMARY)

                Control group                ||               Coached group
No.      Test I  Test II  Points  Per cent  ||  No.   Test I  Test II  Points  Per cent
                          − or +            ||                         − or +

  1        83       87      +4      4.81    ||    1      92     101      +9      9.78
  2        96      110     +14     14.58    ||    2      60      79     +19     31.66
  3        84       89      +5      5.95    ||    3      99     104      +5      5.05
  4        85       91      +6      7.06    ||    4      70      98     +28     40.00
  5        85       92      +7      8.23    ||    5      96     102      +6      6.25
  6        73       87     +14     19.17    ||    6     101     105      +4      3.96
  7        82       96     +14     17.07    ||    7      90     102     +12     13.33
  8        87       96      +9     10.34    ||    8      69      99     +30     43.48
  9        81       77      −4      4.93    ||    9      94     103      +9      9.57
 10        88      102     +14     15.91    ||   10      95      99      +4      4.21
 11        81       95     +14     17.28    ||   11      73      93     +20     27.39
 12        81       94     +13     16.04    ||   12      85     102     +17     20.00
 13        85       96     +11     12.94    ||   13      94     105     +11     11.70
 14        79       93     +14     17.72    ||   14      95     107     +12     12.63
 15       108      111      +3      2.77    ||   15     100     105      +5      5.00
 16        89      104     +15     16.85    ||   16      76     110     +34     44.73
 17        88       95      +7      7.95    ||   17      90     105     +15     16.66
 18       106      108      +2      1.88    ||   18      87     101     +14     16.09
 19        71       80      +9     12.67    ||   19      93     107     +14     15.05
 20        86       85      −1      0.86    ||   20      83      95     +12     14.45
 21        86       86       0      0.00    ||   21      96     100      +4      4.17
 22       108      101      −7      6.48    ||   22      81      99     +18     22.22
 23        90      100     +10      9.00    ||   23      88     103     +15     17.04
 24        99       99       0      0.00    ||   24      80      95     +15     18.75
 25        80       89      +9     11.25    ||   25      86     103     +17     19.76
 26        75       89     +14     18.66    ||   26      49      85     +36     73.46
 27        78       85      +7      8.97    ||   27      92     104     +12     13.04
 28        89       97      +8      8.98    ||   28      70      88     +18     25.71
 29        95      102      +7      7.36    ||   29      76      94     +18     23.68
 30        74       89     +15     20.27    ||   30      76      95     +19     25.00
 31        79       85      +6      7.59    ||   31      68      84     +16     23.52
 32        78       87      +9     11.54    ||   32      79     109     +30     37.97

Total    2749     2997     248              ||         2683    3181     498
Average 85.91    93.68    7.75      9.02    ||        83.84   99.41   15.56     18.56

Increase of coached group over control group = 9.54 per cent.
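The totals, averages, and the 9.54 per cent figure can be re-derived from the individual scores. The sketch below re-enters the two columns of scores for each group; it assumes that the group "per cent" is the total gain divided by the total first-test score, an assumption that reproduces the printed 9.02 and 18.56. The function and variable names are illustrative only.

```python
control_first = [83, 96, 84, 85, 85, 73, 82, 87, 81, 88, 81, 81, 85, 79, 108, 89,
                 88, 106, 71, 86, 86, 108, 90, 99, 80, 75, 78, 89, 95, 74, 79, 78]
control_second = [87, 110, 89, 91, 92, 87, 96, 96, 77, 102, 95, 94, 96, 93, 111, 104,
                  95, 108, 80, 85, 86, 101, 100, 99, 89, 89, 85, 97, 102, 89, 85, 87]
coached_first = [92, 60, 99, 70, 96, 101, 90, 69, 94, 95, 73, 85, 94, 95, 100, 76,
                 90, 87, 93, 83, 96, 81, 88, 80, 86, 49, 92, 70, 76, 76, 68, 79]
coached_second = [101, 79, 104, 98, 102, 105, 102, 99, 103, 99, 93, 102, 105, 107, 105, 110,
                  105, 101, 107, 95, 100, 99, 103, 95, 103, 85, 104, 88, 94, 95, 84, 109]


def summarize(first, second):
    """Totals, mean gain in points, and per cent gain for one group."""
    total_first, total_second = sum(first), sum(second)
    gain = total_second - total_first
    return {
        "total_first": total_first,
        "total_second": total_second,
        "mean_gain": gain / len(first),
        # Assumption: group per cent = total gain / total first-test score.
        "per_cent_gain": round(100 * gain / total_first, 2),
    }


control = summarize(control_first, control_second)
coached = summarize(coached_first, coached_second)

# Excess of the coached group's per cent gain over that of the control group.
excess = round(coached["per_cent_gain"] - control["per_cent_gain"], 2)

print(control)
print(coached)
print(excess)
```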

Conclusion.—In an experiment of this kind a few non-controllable 
factors are apparent. The most noticeable one undoubtedly is the 
inability to regulate, except in a general way, the intensity and amount of
individual attention and study given to the coaching material by the 
individuals of the group. By observing the table one would judge 
that the individual study of the material did vary. However, the 
following points are quite conclusively shown by the experiment: 

1. That students can be coached to the point of increasing their 
standing and score in intelligence tests even in case of the material 
being only similar and not identical with that of the basic test. 

2. That the increase varies in proportion to the amount and inten- 
sity of the coaching, if the whole process be continuous and without 
lapse of too much time. 

3. That the students making the low scores make the greatest 
per cent of increase, i.e., the lower the score the greater per cent of 
increase other things being equal. An examination of the table is 
sufficient evidence that this is generally true. 

4. That a control group makes a substantial gain as a result of the 
repetition of the test. 

5. That material for intelligence tests should be non-coachable if 
it is to measure accurately and fairly a degree of intelligence. 

6. That there is danger in the use of similar and identical material 
of different intelligence tests for a fair rating unless it is certain that 
the student had not at any previous time taken a test or come in 
contact with similar or identical material in some purposeful way. 














A FORMULA FOR CORRELATING INTERCHANGE- 
ABLE VARIABLES 


PAUL HANLY FURFEY 


Catholic University, Brookland, D. C. 


In the usual problem of correlation, the nature of the data will
determine which variate in a given pair is the x-variate, and which is
the y-variate. Problems will occasionally arise, however, in which
this is not so. Suppose, for example, it is desired to correlate the
IQ’s of a series of twins, one against the other. In each pair of IQ’s
there is no way to determine which score should be considered the
x-variate, and which the y-variate. The variables are quite
interchangeable throughout. In a series of N pairs there are $2^{N-1}$ possible
arrangements and these different arrangements yield correlations
which are, in general, different.

For example, consider a problem involving the correlation of the 
IQ’s of six pairs of twins. Let the scores be arranged as follows: 


              x-VARIABLE     y-VARIABLE

Pair I            101            111
Pair II           102            112
Pair III          103            113
Pair IV           104            114
Pair V            105            115
Pair VI           106            116


The correlation is evidently 1.000. But this is only one of 32 
possible arrangements. In Pair I there is no reason why 101 should 
be written in the first column rather than 111. In Pair II we might 
have considered 112 as the x-variate instead of 102. So for the other
pairs. The following arrangement, therefore, will be equally legitimate: 


              x-VARIABLE     y-VARIABLE

Pair I            111            101
Pair II           112            102
Pair III          113            103
Pair IV           104            114
Pair V            105            115
Pair VI           106            116


With this arrangement the correlation drops abruptly from 1.000 to
−.938. Evidently the uncritical use of the correlation coefficient in
such cases can lead to absurdities.
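The two coefficients quoted above can be checked directly. The following is a minimal sketch using the ordinary product-moment formula; the helper name `pearson_r` is illustrative, and the count of arrangements assumes, as in the text, that interchanging every pair at once gives the same correlation and so is not a new arrangement.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation computed from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


pairs = [(101, 111), (102, 112), (103, 113), (104, 114), (105, 115), (106, 116)]

# First arrangement: the scores exactly as tabulated.
r_first = pearson_r([x for x, _ in pairs], [y for _, y in pairs])

# Second arrangement: the members of Pairs I to III interchanged.
second = [(y, x) for x, y in pairs[:3]] + pairs[3:]
r_second = pearson_r([x for x, _ in second], [y for _, y in second])

# Number of distinct arrangements for N pairs, as stated in the text.
n_arrangements = 2 ** (len(pairs) - 1)

print(r_first, round(r_second, 3), n_arrangements)
```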
Which of the various possible values of r represents most nearly
the facts of the case? The problem is, to find an r which shall express 
the tendency of twins to agree in intelligence. It should express the 
tendency for a randomly selected twin to have a twin sib of like IQ. 
The true value, therefore, should come from a random arrangement. 
The first arrangement in our illustrative example, in which the brighter 
twin is systematically put first, evidently does not represent such an 
arrangement. The resulting r really expresses the tendency of the 
brighter twin’s IQ to vary with his duller sib’s IQ and vice versa. As 
an index of agreement between the IQ’s of twins it is evidently entirely 
spurious. All the pairs in the example are separated by 10 points on 
the IQ scale. 

It seems fairly evident that the true r in such cases is the one
resulting from a random arrangement. But how can this arrangement
be secured? To consider the general case it is convenient to
write the correlation formula in the following form which may readily
be derived from Kelley’s¹ formula (94):

$$r = \frac{N\Sigma XY - \Sigma X\,\Sigma Y}{\sqrt{[N\Sigma X^2 - (\Sigma X)^2]\,[N\Sigma Y^2 - (\Sigma Y)^2]}}$$

The term $N\Sigma XY$ evidently remains constant no matter how the
members of the individual pairs are interchanged. Further since,

$$\Sigma X + \Sigma Y = \text{constant}, \qquad \Sigma X^2 + \Sigma Y^2 = \text{constant},$$

we may write,

$$r = f(\Sigma X, \Sigma X^2).$$

That is to say, no matter how the variables are interchanged, the
different arrangements can affect the value of r only insofar as they
affect the terms $\Sigma X$ and $\Sigma X^2$. So r may be studied as a function of
$\Sigma X$ and $\Sigma X^2$ alone.

Now since $\Sigma X$ represents the sum of a sampling of N of the 2N
variates, it is fairly evident that $\Sigma X$ will approximately equal $\Sigma Y$
when N is not small, if the arrangement is a truly random one. In
the same way a random arrangement will tend to give equal values
to $\Sigma X^2$ and $\Sigma Y^2$. The criterion of a random arrangement, then, will
be that $\Sigma X$ shall equal $\Sigma Y$ and $\Sigma X^2$ shall equal $\Sigma Y^2$.

Where the population is large and no systematic error is introduced
this criterion will be approximately fulfilled and r will tend
automatically to approach the correct value. The danger comes from the
possibility of systematic errors, as would be the case, for example, if
an investigator were unaware of the fallacy involved and should
uniformly write in the first column the IQ of the brighter of two twins.
The spurious increase in r in such a case may be very considerable.
It will be easy to derive a new correlation formula which will insure
the correct value for r and which will simplify somewhat the work of
computation.

1 T. L. Kelley: Statistical Methods. Macmillan.

To insure the condition for random arrangement, namely, that
$\Sigma X$ shall equal $\Sigma Y$, and $\Sigma X^2$ shall equal $\Sigma Y^2$, it is only necessary to
increase the number of pairs from N to 2N by writing each pair twice,
first with one score and then with the other score as the x-variable.
In practice the same result may be obtained by substituting 2N for
N, $(\Sigma X + \Sigma Y)$ for $\Sigma X$ and $\Sigma Y$, $(\Sigma X^2 + \Sigma Y^2)$ for $\Sigma X^2$ and $\Sigma Y^2$,
and $2\Sigma XY$ for $\Sigma XY$ in the correlation formula giving:

$$r = \frac{4N\Sigma XY - (\Sigma X + \Sigma Y)^2}{2N(\Sigma X^2 + \Sigma Y^2) - (\Sigma X + \Sigma Y)^2}$$

which would appear to be the correct formula to use.
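The formula can be tried on the six illustrative pairs of twins. The sketch below is an illustration under stated assumptions rather than a definitive implementation: the function names are hypothetical, and the check against the ordinary product-moment coefficient simply writes each pair twice, as the text describes. It does not guard against the degenerate case in which all scores are identical.

```python
from math import sqrt


def interchangeable_r(pairs):
    """Correlation for interchangeable variables, from the formula above."""
    n = len(pairs)
    sum_xy = sum(x * y for x, y in pairs)
    sum_both = sum(x + y for x, y in pairs)             # sum of X plus sum of Y
    sum_squares = sum(x * x + y * y for x, y in pairs)  # sum of X^2 plus sum of Y^2
    numerator = 4 * n * sum_xy - sum_both ** 2
    denominator = 2 * n * sum_squares - sum_both ** 2
    return numerator / denominator


def pearson_r(xs, ys):
    """Ordinary product-moment correlation, used here only as a check."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    return (n * sum_xy - sum_x * sum_y) / sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )


pairs = [(101, 111), (102, 112), (103, 113), (104, 114), (105, 115), (106, 116)]
swapped = [(y, x) for x, y in pairs[:3]] + pairs[3:]

r_pairs = interchangeable_r(pairs)
r_swapped = interchangeable_r(swapped)

# Check: write every pair twice, once in each order, and correlate as usual.
double_x = [x for x, _ in pairs] + [y for _, y in pairs]
double_y = [y for _, y in pairs] + [x for x, _ in pairs]
r_double_entry = pearson_r(double_x, double_y)

print(round(r_pairs, 3), round(r_swapped, 3), round(r_double_entry, 3))
```

For these six pairs the value does not depend on how the members of each pair are ordered, which is the property the formula is meant to secure.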








THE SPEARMAN PROPHECY FORMULA 


C. S. SLOCOMBE
Lincoln School, Teachers College, Columbia University, N. Y. 


Within the last four years there have been four articles in this
Journal discussing empirical studies of the Spearman formula for
estimating “reliability correlation.” In the earlier of these studies
the formula was misapplied, and prediction found to be erratic.
As, however, the use of the formula has been correctly demonstrated
by Kelley,¹ and by Ruch, Ackerson, and Jackson,² it appears
necessary to point out why the last named authors did not obtain
accurate prediction.

It is to be noted that they have recognized (Holzinger³ apparently
did not) that there are a very large number of intercorrelations of 
unitary tests, and that theoretically all these should be used in obtain- 
ing the average on which the prediction is to be based. As the 
calculation of all these coefficients is exceedingly laborious, an approx- 
imation may be reached by taking a sample of them; though if this is 
done, the chance of error in prediction is increased, or rather made 
possible. 

The selection of the sample of coefficients for calculation of the
average may be made in one of two ways: (a) By random sampling;
(b) by deliberately including in the sample such coefficients as will
make it a representative one (apart from chance variations). Ruch,
Ackerson, and Jackson have elected the second method but, not noting
a certain regular variation, have failed to obtain a true sample of
coefficients.

It has been found on numerous occasions that, in repetitions 
of one form of test, the correlation between performances at short 
intervals is on the average higher than that at long intervals; generally 
that the coefficient is inversely proportional to the length of interval, 
measured in hours, days, etc. This appears to be due to the effect of
practice, and in fact the presence of such regular variation is an
excellent indicator of practice.⁴

In the table presented by Ruch, Ackerson, and Jackson (p. 310, loc. 
cit.) are the coefficients they have used in obtaining the average. In 
this table the average of coefficients at an interval of one (day?) = 
.85; the average at an interval of ten (days?) = .83. The difference 
between these averages is small, and sufficient only to indicate the 
possible presence of this effect (and to illustrate the point). It is to be
noted that the interval is not a maximum one, and that the presenta- 
tion of fuller data would almost certainly show a greater difference. 
The average which they have used is thus weighted by the inclusion of 
high correlations at short intervals, and the absence of low cor- 
relations at maximum intervals. Hence being higher than the true 
average, it yields an over-prediction—particularly in the earlier 
portion of the curve (p. 312, loc. cit.). 
The value of the predicted correlation as given in column 2 (p. 311,
loc. cit.) of their article (.847) appears to be a false one, and should 
not be represented in the curve. The predicted correlation of the 
first test with itself is either 1.00, or it must be obtained by splitting 
the test in halves, and applying the Spearman formula in question. 
In either case it will exceed .847. 

That is to say, the over-prediction observed is in the cumulation of
the first ten tests. The practice effect in these tests has not sufficient
influence to raise the “obtained correlation” to the “predicted
correlation” which is calculated from an average influenced by practice
throughout the whole series of twenty tests.


CONCLUSION 


1. Great care is necessary in the use of this formula, in view of the 
almost certain presence of systematic changes (due to practice and 
fatigue) in the repetitions of single test forms, and of the Binet tests. 
Such changes are generally not present when a battery of forms is 
repeated. 

2. Ruch, Ackerson and Jackson, by their method of selecting 
a sample of coefficients, did not eliminate the effect of this practice, 
and so obtained some degree of over-prediction. 

3. Where a sample of coefficients is to be used in place of the 
theoretically correct total, it is suggested that either (1) the selection 
be random, or (2) the coefficients be selected from the pairs 1.20; 
2.19; . . . 9.12; 10.11, and not from the pairs 1.2; 3.4; ... 
17.18; 19.20. 
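The two samples of coefficients contrasted in the third conclusion, and the effect of a slightly inflated average on the prediction, can be sketched as follows. The article does not print the prophecy formula; the sketch assumes the usual Spearman-Brown form, n·r / (1 + (n − 1)·r), and the choice of ten tests and of the averages .85 and .83 is taken from the discussion above purely for illustration.

```python
def prophecy(mean_r: float, n_tests: int) -> float:
    """Assumed Spearman-Brown form: predicted correlation for n_tests pooled forms."""
    return n_tests * mean_r / (1 + (n_tests - 1) * mean_r)


n_forms = 20

# Pairs 1.2; 3.4; ... 19.20: every coefficient comes from adjacent repetitions.
adjacent_pairs = [(i, i + 1) for i in range(1, n_forms, 2)]

# Pairs 1.20; 2.19; ... 10.11: the intervals range from long to short.
nested_pairs = [(i, n_forms + 1 - i) for i in range(1, n_forms // 2 + 1)]

adjacent_intervals = [later - earlier for earlier, later in adjacent_pairs]
nested_intervals = [later - earlier for earlier, later in nested_pairs]

# A higher average coefficient gives a higher predicted correlation.
prediction_from_short_intervals = prophecy(0.85, 10)
prediction_from_longer_intervals = prophecy(0.83, 10)

print(adjacent_intervals)
print(nested_intervals)
print(round(prediction_from_short_intervals, 3), round(prediction_from_longer_intervals, 3))
```

The adjacent pairs all have an interval of one, whereas the nested pairs cover intervals from nineteen down to one, so that an average taken over them is not weighted toward the short intervals.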


1. Journal of Educational Psychology, 1925, p. 300. 

2. Journal of Educational Psychology, May, 1926, p. 309. 

3. Journal of Educational Psychology, 1923, p. 302;
Journal of Educational Psychology, 1925, p. 289.

4. British Journal of Psychology, Oct., 1926. 
AN ABAC FOR FINDING THE STANDARD ERROR OF
A PROPORTION AND THE STANDARD ERROR 
OF THE DIFFERENCE OF PROPORTIONS 


HAROLD A. EDGERTON 
Ohio State University 


In order to facilitate the computation of the standard error of a 
proportion and the standard error of the difference of proportions, the 
accompanying abac was devised. 

The standard error of a proportion is

$$\sigma_p = \sqrt{\frac{pq}{N}}$$

where p is the proportion of responses in question, q = 1 − p, and N
is the total number of cases.

The standard error of the difference of proportions is

$$\sigma_{p_1 - p_2} = \sqrt{\frac{p_1 q_1}{N_1} + \frac{p_2 q_2}{N_2}} = \sqrt{\sigma_{p_1}^2 + \sigma_{p_2}^2}$$

By use of the abac the entire equation from $p_1$ and $p_2$ to the ratio
of the difference to the standard error of the difference may be solved.
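For readers without the abac at hand, the same quantities can be computed directly. The sketch below follows the two formulas just given; the function names and the sample proportions are hypothetical and serve only to illustrate the arithmetic that the abac performs graphically.

```python
from math import sqrt


def se_proportion(p: float, n: int) -> float:
    """Standard error of a proportion p based on n cases."""
    q = 1.0 - p
    return sqrt(p * q / n)


def se_difference(p1: float, n1: int, p2: float, n2: int) -> float:
    """Standard error of the difference between two independent proportions."""
    return sqrt(se_proportion(p1, n1) ** 2 + se_proportion(p2, n2) ** 2)


def ratio_to_standard_error(p1: float, n1: int, p2: float, n2: int) -> float:
    """Ratio of the difference p1 - p2 to its standard error."""
    return (p1 - p2) / se_difference(p1, n1, p2, n2)


# Hypothetical example: 60 per cent of 200 cases against 50 per cent of 200 cases.
example_se = se_difference(0.60, 200, 0.50, 200)
example_ratio = ratio_to_standard_error(0.60, 200, 0.50, 200)

print(round(example_se, 4), round(example_ratio, 2))
```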








DIRECTIONS FOR USING THE ABAC 


It is necessary to know only $p_1$ and $p_2$, the two percentages or
proportions to be compared, and $N_1$ and $N_2$, the total number of cases
in each sampling.

1. Find the line p along the axis labeled “proportion p.”

2. Follow up this ordinate to the point where it intersects the
diagonal line N.

3. The value of $\sigma_p$ can be read on the scale at the left hand edge of
the abac.

If the standard error of the difference is desired: 

4. Go from the intersection of p and N horizontally to the right to 
scale A. 

5. Transfer the value found on scale A to scale C. 

6. Use $p_2$ and $N_2$ as in 1, 2 and 4.

7. Connect the value on scale C (found from $p_1$ and $N_1$) with the
value on scale A (found from $p_2$ and $N_2$) by a straight line.

8. The standard error of the difference is read from scale B at the 
intersection of scale B and the straight line connecting the points on 
scale A and scale C. 
To find the ratio of the difference ($p_1 - p_2$) to the standard error
of the difference:

9. Transfer the standard error of the difference from scale B to 
scale D. 

10. Connect this point on scale D with the point on scale E having
the value of $p_1 - p_2$.

11. Project this straight line to scale F. The intersection of this
line with scale F gives the ratio of the difference to the standard error
of the difference of proportions.

A rubber band seems to serve better than a straight-edge for 
connecting points on the scales. 

In case the difference ($p_1 - p_2$) is less than the standard error of
the difference, the ratio can be found by using 10 times the value of
the difference on scale E and the ratio thus found is read as one-tenth
that found on scale F.

If one has a number of problems to solve, using this particular
formula, in which $N_1$ or $N_2$ is the same for a number of percentages,
time can be saved by clipping a strip of paper along scales A and C
showing the value of the $\sigma_p$ of the varying per cents for which N is
the same.
NEW PUBLICATIONS IN EDUCATIONAL
PSYCHOLOGY AND RELATED FIELDS OF
EDUCATION


CONDUCTED BY JOHN HOCKETT 
The Lincoln School of Teachers College 











CAUSES AND PREVENTION OF DELINQUENCY 


Delinquents and Criminals: Their Making and Unmaking, Studies in 
Two American Cities, by William Healy and Augusta F. Bronner. 


New York: The Macmillan Company, 1926. Pp. VIII + 317.
(Judge Baker Foundation Publication No. 3.)


In this volume the authors present data which they have been 
17 years collecting, concerning the causes leading to delinquency and 
the factors which make delinquency lead to adult crime. These 
authors are already known for case studies on delinquents and criminals.
They have been favorably located for making their studies in
two American cities—Chicago and Boston—and have been admirably 
supported by the Judge Baker Foundation, the Commonwealth Fund, 
and private individuals. The present work differs from their previous 
work in being largely statistical. 

Three groups of juvenile offenders, whose subsequent careers
have been traced, were used in this study: (1) 920 cases originally
studied in Chicago between 1909 and 1914 and followed up in
1921-1923; (2) a series of 400 young male offenders who appeared in the
Boston Juvenile Court in 1909-1914 who were followed up in 1923;
and (3) a group of 400 boys, originally studied in Boston in
1918-1919 and whose subsequent careers have been regularly followed.

Conclusions as to the cause of crime have been rife. Healy and 
Bronner’s work tends to discredit many of these causes. They 
find no appreciable relationship between delinquency and heredity, 
size of family, physical and mental conditions of the offender, nativity 
of parents, or religious affiliation. That they find no relationship 
between delinquency and physical and mental conditions of the 
offender is especially important. There is a tendency today, among 
psychologists, perhaps because of the development of mental tests 


and the use of the correlation technique, to ascribe behavior
phenomena to personality differences or disturbances. This use of tests,
resulting in a description of the personality, has almost obscured the
differential effect of different environments on persons of quite normal
characteristics. As a matter of fact Healy and Bronner find that
factors which are directly causative of delinquency are: bad
companions, adolescent instability and impulses, early unfortunate sex
experiences, mental conflicts, school dissatisfaction, poor recreations,
street life, vocational dissatisfaction, and physical conditions. With
regard to motion pictures they say, “Starting with ideas somewhat
to the contrary, we have been surprised to find that moving picture
shows seem to have very little effect in the production of delinquent
tendencies.” Healy’s earlier work with mental conflicts leads him to
believe that a more intimate study of these cases would uncover this
to be a more predominant cause. My own thinking sees a “mental
conflict” as an indication of a type of conduct response to a difficult
situation—that is, the “mental conflicts” are complementary to the
bad environmental conditions.

The authors find a very high amount of later failure among delin- 
quents (failures are defined as individuals having adult court records 
and adjudged guilty as well as those committed to adult correctional 
institutions) amounting to 61 per cent for males (15 per cent being 
professional criminals and 5 per cent having committed homicide) 
and 46 per cent for girls (19 per cent being prostitutes). The data 
indicate that juvenile delinquency leads to adult crime and that much 
of adult crime can be traced back to juvenile delinquency. The 
authors incline to the belief that much of this is preventable. They 
lay the blame largely to the treatment of delinquents by placing them 
in reform schools, and institutional homes. Their data indicate that 
careful placement in private homes and more careful parole works 
toward turning delinquency into a successful after life. 

The statistical treatment is largely by means of percentages. The 
reviewer was many times annoyed in reading by the failure to bring 
out the exact amount of relationship between two variables. Some 
single figure was needed, such as the coefficient of contingency, to 
show whether or not a relationship was slight or high. There seemed 
to be some confusion as to when a relationship was actually demon- 
strated. Sometimes the mere existence of a factor among the delin- 
quent group was taken as an indication of relationship as when love 
of adventure was given as a cause of delinquency because it was 
described as the exciting cause in 2.5 per cent of the cases. On the 
other hand comparison was made between the incidence of a factor in
the delinquent group as compared with the general population—as 
when 22 per cent of the delinquents were found with nose or throat 
ailments which was not taken to be a cause of delinquency because this 
percentage is similar to that in the general school population. One 
wonders how many boys who have love of adventure do not become 
delinquent. One suspects that the authors determined the causes 
from their general experience and familiarity with the problem and 
then used the figures to bolster up their beliefs. Differences were 
found between the percentages in Chicago and Boston and much was 
made of these differences. But no attempt was made to show that 
the differences were statistically significant. 

Notwithstanding these defects in the statistical handling of the 
material, there seems much merit in the general conclusions. The 
authors present a program for handling the delinquent problem which 
deserves much careful consideration. PERCIVAL M. SYMONDS.

Teachers College, Columbia University. 





Lipps’ PRE-GESTALT STUDIES 


Psychological Studies by Theodor Lipps, translated by H. C. Sanborn. 
Baltimore: The Williams and Wilkins Co., 1926. Pp. 333. $6.00. 


Much of the technical literature needed by students of psychology 
as well as by workers in fields related to or largely dependent upon 
psychology is not generally enough available. Some is out of print. 
Some is written in a foreign language not known by the worker. The
Psychology Classics, a series of reprints and translations of important
treatises in psychology of the past, is purposed to help remedy this
situation and broaden the reading of students. The series is edited
by Professor Knight Dunlap of Johns Hopkins. The publishers
of the series have offered to match the royalties from these volumes by
an equal amount, all to facilitate the preparation of further translations
and reprints. The studies of Lange and James on “The Emotions”
were selected for Volume I. The present Psychological Studies
by Professor Lipps, translated by a former student, Professor Sanborn,
are published as Volume II of the series.

The studies include three monographs which, aside from the fact 
that they are all representative of Professor Lipps’ attitude and point 
of view in psychology, have little relationship, which is relationship
enough for the purpose of this series. Of the three, the one that will
doubtless be of interest to more people from more fields of knowledge
is the second, “The Nature of Consonance and Dissonance.” This
will be of special interest to students of psychology of music and
esthetics. In it are critically reviewed or described the important
theories of “tone-rhythms,” the theories of Helmholtz, Krueger, Stumpf,
Wundt; Meyer’s theory of melody; and the author’s own theory which
accounts for all consonance and dissonance by a reference to the
simpler and less simple vibration ratios between simple tones.

The other two studies included will be of interest only to serious 
students of psychology and philosophy. The first is called “The Space
of Visual Perception.” In it are discussed such topics as consciousness
of depth, Lotze’s theory of ideas of movement, Wundt’s nativistic
theory of another sort and the author’s own theory which, we are
informed, does justice to the nativistic view because “with reference to
the individual the theory is essentially nativistic; by itself, it is, after
all, thoroughly genetic.” Geneticism vs. nativism cannot be considered
a modern psychological problem. For this reason the whole
contents of this monograph might superficially be considered as being
nothing more than a “harvest of leaves.” The same might be said of
the last section called “The Law of Psychic Relativity and Weber’s
Law.” Not very many psychologists are today very much concerned
about the three immutable laws of Wundt which are just contrary to
the three fundamental laws that rule the material world. Wundt’s
treatment of individual consciousness did not permit him to grapple
with the problems of defective, abnormal and subnormal life. Nor
are psychologists any more concerned with Weber’s law. Nor are
they likely to be with Lipps’ modification described in this section.
Apparently, then, the two studies contained in the first and last sections
have historical importance only. But as Professor Dunlap notes in
the preface, “actually the details of mental function organized under
these headings are of continuing importance: an importance demonstrated
by the recent rise of interest in the ‘Gestalt’ theory, on which
Lipps’ discussion bears to a considerable extent, although written
before the promulgation of that theory.” The volume contains
Lipps’ criticisms of Ehrenfels’ concept of Gestaltqualität, Krueger’s
concept of Complexqualität and a statement of his own Gesamtqualität.
“Such complex qualities or total-qualities exist; i.e., there are qualities
of a whole which are not qualities of the parts of the whole,” Lipps
informs us in one place. Statements like these would not be difficult
to discover in the writings of Koffka, Köhler and other leaders of the
Gestalt movement. They have undoubtedly profited by the writings
of Professor Lipps. H. MELTZER.
Oregon State College, Corvallis, Oregon. 





ORGANIZED LABOR AND EDUCATION 


Educational Attitudes and Policies of Organized Labor in the United
States, by Philip R. V. Curoe. New York: Teachers College,
Columbia University Contributions to Education, No. 210, 1926.
$1.50.


Dr. Curoe’s detailed treatment gives an interesting history of the 
attitudes of organized labor groups to education, considered in the 
narrow sense, as “schooling,” since 1828. The book is largely concerned
with attitudes and policies toward existent schools, notably
public education. 

The account readily divides itself into two sections. The first is 
an outline history of the American labor movement, as well as a history 
of educational attitudes, giving the backgrounds out of which pro- 
nouncements and policies grew. It shows labor supporting mainly 
free, compulsory education, general “education of the public” in
industrial matters, and educational development of the workers. 

The second and more important section considers the official atti- 
tudes of the longest-lived labor organization, the American Federation 
of Labor. It is apparent, in reading the account, that the Federation
has stood with remarkable consistency on the side of greater democracy
and liberalism. The chief break in its record has been its opposition
to “capitalistic” foundations, such as the Carnegie and
Rockefeller educational funds. The Federation’s opposition to the
wealth represented has, at times, resulted in blind condemnation of
expert findings. But the Federation on the whole can present a
record as progressive, public-minded, and intelligent as any large
organization, especially since the affiliation of the American Federation
of Teachers with the labor group.

It is unfortunate that Mr. Curoe does not give us more material on 
recent labor education. Although he presents a brief review of the 
work under the Federation, he does not touch the progress being made
by such groups as the Amalgamated Clothing Workers, or other labor
groups not affiliated with the Federation. Nor does he go into the
interesting experiments encouraged by labor groups, though not
officially under their auspices. It would also be interesting to learn
the reactions of labor to recent controversies in educational theory,
indoctrination vs. free discovery in education, etc. Labor is certainly
increasingly alive to such questions. WM. W. BIDDLE.
Manumit School, Pawling, N. Y. 





STUDIES OF INDIVIDUAL DIFFERENCES 


Individual Differences in the Intelligence of School Children, by Mary M. 
Wentworth. Harvard University Press, 1926. Pp. 162. $2.00. 


Miss Wentworth in this book describes studies of children, which 
centered around testing and retesting with the Stanford-Binet Tests. 
She includes some inconclusive statistical material, but the major,
most important, and interesting portion of the report describes in 
detail individual differences in everything but intelligence, and indi- 
cates how these differences affect test scores. 

The 112 case studies reported make the book decidedly worth while,
and confirm the reviewer’s opinion that the Stanford-Binet test is
the most excellent standardized interviewing procedure yet devised. 
For it (the test) gives the interviewer valuable information as to char- 
acter qualities and enables him to make a fairly shrewd guess as to 
the intelligence of the subject. 

Miss Wentworth hardly makes enough of this, if indeed she
realizes it. She does not appear to be clear, or to attempt to be clear,
as to what intelligence is, and what the factors are which must
always influence its measurement. This possibly accounts for her
choice of a deceptive title to the study. C. S. SLOCOMBE.

Lincoln School of Teachers College, New York. 





RECENT WORK IN ARITHMETIC


Modern Methods of Teaching Arithmetic, by Ralph S. Newcomb. 
Boston: Houghton Mifflin Co., 1926. Pp. XV + 345. $2.00.


In this book the author has attempted to meet the demands of 
the modern conception of the purposes of instruction in arithmetic 
and to adapt methods and devices in accordance with general pedagogical
and psychological principles accepted today. The field covered
by the author includes not only the customary topics such as the funda- 
mental operations, common and decimal fractions, percentage, drill 
and problem-solving, but also a study of tables, statistics, graphs, 
algebra and geometry suitable for the elementary school as well as 
some discussion of the curriculum in arithmetic emphasizing socializa- 
tion and correlation. Constant reference has been made to the field 
of educational psychology for the rejection or substantiation of the
methods discussed. However, throughout the entire book the reviewer 
finds only two references to publications later than 1924 and suggests 
that some use might well have been made of some of the valuable 
contributions of the last two years. The book is set up after the 
manner of a textbook with discussions under topical headings and 
questions at the end of each chapter, and so is very well adapted for
classroom use in teachers colleges and normal schools.


An Arithmetic for Teachers, by William F. Roantree and Mary S.
Taylor. New York: The Macmillan Co., 1925. Pp. XIII + 621.


In An Arithmetic for Teachers the authors have combined material
that will provide the teacher with a thorough knowledge of the subject,
with material and discussions of the methodology of teaching arith- 
metic. They have been guided by a belief that a teacher needs a good 
foundation in subject-matter along with methods of presentation and 
so have provided for both by dividing each chapter into two parts: 
teachers’ knowledge, and methods of teaching. They have devoted 
more space to the problems of economics and business than is usual, 
putting considerable emphasis on interest, commercial paper, insurance, 
stocks, bonds, building loans, and taxation. This book will not only
be useful for classes in schools of education but also be a valuable addition to
the library of every teacher in service who has anything to do with the 
supervision or teaching of arithmetic. 


What Arithmetic Shall We Teach? by Guy M. Wilson. Boston: 
Houghton Mifflin Co., 1926. Pp. VIII + 149. 


Mr. Wilson would approach the problem of curriculum construc- 
tion from an analysis of life’s needs and in this book he summarizes 
the conclusions at which he has arrived from his own studies and a 
survey of the related investigations of others. His recent studies bear 
out with only the very slightest variations the conclusions drawn from 
his former study, A Survey of the Social and Business Usage of Arithmetic,
1918. The writer studied the topics of arithmetic and their use
in various vocational fields, interpreting the results by applying the 
criteria of cruciality, frequency, adaptability, and relative values. 
He couches his conclusions in the last two chapters, in which he summarizes
the processes and the degree of difficulty for the grades, and
sets up the new course of study. His data are undoubtedly valid, his 
conclusions apparently sound, and the book a real contribution to the 
science of curriculum-making. 


Teaching Number Fundamentals, by Milo B. Hillegas. Philadelphia:
J. B. Lippincott Co., 1925. Pp. 98. 


Teaching Number Fundamentals is a teacher’s manual which accom- 
panies the Horace Mann Supplementary Arithmetic, to give directions 
for the use of the drill book and to explain the principles upon which it 
was developed. The manual is divided into three parts; the first part 
includes a brief discussion of the need for drill material and the salient 
features of this particular drill material; the second part is given over 
entirely to directions for the teacher’s guidance in using the material;
the third part contains an analysis of the exercises in the drill book. 


Horace Mann Supplementary Arithmetic, by Milo B. Hillegas, Gertrude 
Peabody and Ida M. Baker. Philadelphia: J. B. Lippincott Co., 
1925. Pp. 156. 


Horace Mann Supplementary Arithmetic (diagnostic and remedial) 
provides practice material for addition, subtraction, multiplication, 
and long and short division of integers. Each process is divided into three
parts. The first part contains the basic facts, the second and third 
parts contain the steps of the processes, graded in difficulty. For each 
step in the sequence two equivalent sets of examples are given. It 
provides good practice, very well graded, but can scarcely be considered
diagnostic inasmuch as it tells only what the child can not do,
and not why he is unable to do it. 


Social Arithmetic, Book One, by Frank M. McMurry and C. Beverly
Benson. New York: The Macmillan Co., Pp. V + 345. 

The authors have written this book with two apparent principles 
in view: (1) that it was not only for children but to children, (2) that it 
should utilize all such available information as could possibly be adapted 
to the field of arithmetic. They have no concern about the teacher 
as is evidenced by the fact that there is neither an author’s nor an 
editor’s preface, the only introduction being a few pages addressed
“to the boys and girls who have this book.” It sets up many situa- 
tions involving the fundamental processes through long division and 
common fractions, and provides some material which is pure drill. 
The chapter titles are significant enough to merit the mention of a few 
of them here: What Tom Learned about Buying Groceries, Important 
Facts about Numbers on the Farm, Where People Live and Work in 
Large Cities. Despite the remarkable ingenuity displayed by the 
authors in finding and using actual life situations one can not but feel 
that they have strained a point occasionally. The reviewer thinks it 
might be an interesting method of teaching arithmetic if not an 
economical one. JOSEPHINE M. HALEY.
Lincoln School of Teachers College. 





ASSOCIATION AS AN INDEX OF INTELLIGENCE 


The Relation between Association and the Higher Mental Processes, by 
J. W. Tilton. Bureau of Publications, Teachers College, Columbia 
University, 1926. Pp. VIII + 55. $1.50. 


Current psychological literature abounds with theories and opinions 
concerning the nature of mental organization. There is no one con- 
ception which is generally accepted. Professor Thorndike has one 
theory; Professor Spearman, another; Gestalt psychologists, others. 
This study is purposed to throw light on one important phase of the 
problem; namely, the relation of the number of associations and the 
higher mental processes. 

The literature of the subject is briefly surveyed in the first two 
chapters; the literature of opinion in Chapter I, the literature of fact 
in Chapter II. In the remaining six short chapters are reported the 
procedure and findings of the present statistical investigation. 

The results of his findings the author summarizes for us as follows: 
“Scores in the association tests (1) were found to correlate as highly
with those in the higher mental process tests as one higher mental 
process test did with another; (2) were as highly correlated with a 
composite score as were the scores in higher mental process tests; 
(3) were as highly correlated with school success (slightly higher in the 
case of arithmetic) as determined by elementary grades, high school 
grades, and school progress; (4) were as highly correlated with a criterion
of intelligence composed of the composite test score and school
success.” 

These results the author judiciously warns us are submitted not as 
proof of causal relationship but as facts. But these facts clearly show 
that the number of correct associations is, at least in the case of the 
pupils tested, as good an index of intelligence as is their ability to
respond adequately to novel situations. Which is another way of
saying that generally the most informed person thinks best. What is
known (“association”) as measured by “habit tests” such as vocabulary,
information and arithmetic seems to serve as “drives,” in the
Woodworth sense, to adaptation to novel situations as measured by
“power tests” such as completion, analogies, arithmetic completions,
and arithmetic equations. 

This study impresses the reviewer as being a well-planned investigation,
carefully worked out and cautiously interpreted. Like many
other Ph.D. dissertations it is not so well written. The results seem to 
validate the opinion of Thorndike that “the mind is ruled by habit
throughout.” Will a finer differentiation of quality of mental ability, 
say such as that suggested by Professor Ogden in his new book, 
Psychology and Education, yield the same results? If mental tests 
are selected which would follow the four main types of learning as 
described by Ogden and other Gestalt psychologists; namely, (1) 
differentiation, (2) assimilation, (3) gradation, (4) re-definition— 
would the results be identical with, similar to, or different from the 
results reported by Dr. Tilton? H. MELTZER. 

Oregon State College, Corvallis, Oregon. 





CHOOSING THE BEST EXISTING COURSES OF STUDY


Rating Elementary School Courses of Study, by Florence B. Strate- 
meyer and Herbert B. Bruner. New York: Bureau of Publica- 
tions, Teachers College, Columbia University, 1926. Pp. 193. 


The present task of the Bureau of Curriculum Research is based on 
the assumption that the new curriculum should be sufficiently near to
present practice to be understood and applied by the schools of our
nation. Therefore, it undertook to analyze existing courses of study
in order to discover several of the best ones in each of the formal school
subjects. In this way the Bureau hopes to bring curricula up to the
level of the best present practice. 

Nine thousand courses of study for the kindergarten and first six 
grades were assembled. Research students determined the criteria 
to be used as a basis for evaluating the courses of study by listing the 
points of strength or weakness of 498 courses of study. The list yielded 
886 points of strength and 827 points of weakness. Standards for 
measuring textbooks were also used as a source of criteria. For 
each subject there were criteria which represent diverse and opposing 
points of view. 

The criteria are concerned with the need for stating objectives; 
the validity of the objectives; the nature and mode of determining 
subject-matter; the adaptation of the curriculum material to the needs 
of the pupil; the adaptation of the course of study to the teacher’s 
needs; the clearness, attractiveness, and typography of the course of 
study. The criteria emphasize the importance of stating the objectives
specifically for each grade; of organizing the content by activities; of 
organizing the materials in terms of pupil differences; of including 
illustrative lessons; of including standards for judging instruction; 
of including directions for the use of materials; of including carefully 
worked bibliographies for teachers and pupils. 

The judges were 121 Teachers College research students recom- 
mended by members of the faculty. All the courses of study in one 
subject were rated by at least three judges. In cases of disagreement 
the courses were submitted to two additional judges. In other words, 
the final ranking of the courses of study in arithmetic, for example, 
was the result of the judgment of not more than five persons. 

We have here the first attempt at a kind of score card for evaluating
courses of study in any field. Each set of criteria or score card,
as it were, suggests some of the good qualities a curriculum-making 
group should introduce into a course of study. Since opposing points 
of view are included among the criteria we are given several of the 
fundamental issues in each of the formal subjects which are helpful in 
defining the position of a curriculum-making group. Furthermore, 
we have an inventory of some of the more important qualities which a 
course of study should possess. We have not only a score card, but also 
an application of it to about 9000 courses of study representing prac- 
tically the complete output of the nation. The outcome of the rating 
is the discovery of about 200 of the best courses of study in the country. 
The publication of this list of superior courses of study should serve 
to set up a goal toward which school systems may strive. Furthermore,
it should serve to awaken certain school systems, whose courses
of study are not included, out of their complacency. 

The procedure of the Bureau, however, has certain weaknesses. 
There is no assurance that the best present practice meets the real 
needs of child and adult life. It is not safe to intrench the position of 
present practice by the sanctions of research. In discovering the 
criteria of good courses of study it matters little whether opposing 
points of view are represented but it matters much that the right point 
of view is represented. Indeed, when opposing points of view are 
included, it is very likely that some very bad ones may creep in. It is 
difficult to see how one may arrive at a measure of value when a cri- 
terion and its opposite have the same worth. We are not told how 
the tentative criteria in each of the subjects were built from the points 
of strength and from the standards of measuring textbooks. 

In spite of the agreement of three judges on the ranking of several 
hundred courses of study, the number of judges is too small to give their
results sufficient reliability. In spite of the evidence concerning the 
qualifications of the judges there is not adequate proof to assure their 
competency. 

As a check on the selection of the best arithmetic courses, we are
told that, of the courses considered best by 31 outside specialists, only two
did not appear on the list built by the judges. However, we are not
told how many courses in the judges’ list would have been excluded by 
the specialists. In other words, we are not given any exact informa- 
tion on the agreement between specialists and the judges. 
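
The missing information can be made concrete with a small sketch, offered only as an illustration of the reviewer's point. It compares two lists of "best" courses in both directions: how many of the specialists' choices the judges omitted, and how many of the judges' choices the specialists would not have named. The function name and the course labels are hypothetical; no figures from the study are used.

```python
def overlap_report(specialist_best, judge_best):
    """Summarize agreement between two lists of selected courses, in both directions."""
    specialist_best = set(specialist_best)
    judge_best = set(judge_best)
    shared = specialist_best & judge_best
    return {
        # Courses the specialists chose that the judges left out.
        "specialist_only": len(specialist_best - judge_best),
        # Courses the judges chose that the specialists did not name.
        "judge_only": len(judge_best - specialist_best),
        "share_of_specialist_list_confirmed": len(shared) / len(specialist_best),
        "share_of_judge_list_confirmed": len(shared) / len(judge_best),
    }


# Hypothetical course labels, not taken from the study.
specialists = ["A", "B", "C", "D"]
judges = ["A", "B", "C", "E", "F", "G", "H", "I"]
print(overlap_report(specialists, judges))
```

In this made-up case the judges miss only one of the specialists' four choices, yet five of the judges' eight choices are unconfirmed by the specialists. A report of the first figure alone would therefore overstate the agreement.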

The chief difficulty with the study is its attempt to cover too much 
ground. Consequently, it lacks in thoroughness what it gains in 
completeness. The results, therefore, are a rough selection of the 
best courses of study; the determination of certain criteria of good 
courses of study; and a tentative measuring instrument for the evaluation
of curricula. HENRY HARAP.

Cleveland School of Education. 
IMPROVING CHILDREN’S THINKING THROUGH METHODS OF STUDY


Children’s Thinking, A Study of the Thinking Done by a Group of 
Grade Children When Encouraged to Ask Questions about 
United States History, by Inga Olga Helseth. New York: 
Teachers College, Columbia University Contributions to Education
No. 209, 1926. Pp. V + 163. $1.50.


Seek not here for a rigorous analysis of the thought processes 
of children! Rather, come expecting a description of some good 
teaching with an analysis of wherein the class work at the end of the 
year seemed to be getting along better than it did at first. Come to 
note the way in which verbatim reports may be dissected, arranged, 
sampled, viewed from various angles, until steps of development stand 
out clearly. 

Of unusual value are the 81 pages of appendix, reporting in detail 
many of the class sessions, in which these 16 children in seventh and 
eighth grade were led to ask questions about United States history, 
and to make their own plans for answering the proposed questions. 
The author finds that from September to May the amount of talking 
by the teacher was gradually reduced from 60 per cent to 14 per cent 
of the total number of words. Pupils were nonplussed by the demand 
for their own questions, in the fall, but by spring questions were not 
only numerous but appeared to be of a sort requiring a high grade of 
organization. One hundred and fifty students showed a large measure 
of accord with the correct order, when they attempted to tell, simply 
by analysis of samples from the viewpoint of skill in thinking, which 
samples came from the first quarter (91 per cent correct), which from 
the second quarter (59 per cent correct), from the third quarter (66 
per cent), and which belonged in the last quarter (84 per cent correct) 
of the year’s work. 

It appears that so much attention to methods of study and proc- 
esses of going about work, did not detract from the amount of 
information acquired. In most of the standard history tests the pupils 
at the end of the year excelled the norms for the country. Unfortun- 
ately the content tests were not given at the beginning of the work, so 
it is impossible to know the exact improvement during the year. At 
certain other points, however, such as the ability to find facts in a 
text readily, to state quickly the question a paragraph is answering,
and to make suggestions about how to study a chapter, the class 
showed large, clear gain. 
Since no control group was used the author feels limited to the
general conclusions that children in seventh and eighth grade are 
capable of asking and answering for themselves good questions about 
history, that improvement in skill in thinking about such questions 
is possible, that attention to method of study is desirable, and that 
certain other values are retained even though the attention be chiefly 
on method of study. In other words, this method of study in which 
the pupils under careful guidance propose and investigate their own 
problems does very well. 

The conclusions are a bit obvious, but it is well to have them so 
firmly secured, and particularly interesting to observe the techniques 
for analysis of the records of a process, which the author has evolved. 

GOODWIN B. WATSON.
Teachers College, Columbia University.





ADJUSTING HIGH SCHOOL PUPILS


Pupil Adjustment in Junior and Senior High Schools, by W. C. Reavis. 
New York: D. C. Heath and Co., 1926. Pp. XVIII + 348. 


The discovery of the capacities and interests of pupils and the 
adaptation of the curriculum and instruction to the peculiar needs of 
each are accepted as a responsibility of the high school. Exploration 
is commonly spoken of as the peculiar function of the junior high school. 
The question may fairly be raised, however, whether this new form of 
organization has justified itself through better adjustment of the work
of the school to the needs of individual pupils. 

From reading the literature dealing with testing and guidance 
and from observation of school practice, one must conclude that 
for the most part pupil adjustment extends little beyond the classifica- 
tion of pupils into groups of approximately equal ability on the basis of 
rather crude tests. This doubtless represents an advance on former 
procedure, but for the exceptionally bright or dull and for the mal- 
adjusted child at any level of ability the school usually falls far short 
of any adequate attempt at adjustment. For the more careful 
observation, testing, and guidance of the exceptional child, the school 
has neither the trained personnel nor a program of procedure. 

Principal Reavis, of the University of Chicago High School, in 
his Pupil Adjustment in Junior and Senior High Schools, has presented
the need of the maladjusted pupil for counseling and guidance, with a 
working program for meeting this need through the case method. 
The chief merit of the book is that the author is not content with
setting up a theoretical program. Indeed the greater part of the book 
is devoted to the specific and detailed report of nine typical cases. 
The scope of the treatment may be best shown by giving the descrip- 
tive designations of these cases: a case of social maladjustment; of 
physical and health disabilities; of endocrine deficiency; of unfavorable 
mental predilection; of deficient previous school training; of speech 
disability and emotional complications; of personality maladjustment; 
of ineffective habits of work and study; of psycho-physical defects. 

This book will serve a useful purpose in pointing out clearly to the 
busy principal a need of which he must be already aware and a method 
which has worked in a well-known school. There is danger that some 
who read the book will find it convincing but will nevertheless conclude 
that because of the lack of trained personnel workers necessary to 
its operation the method cannot be carried out in their schools. 
Although few schools present so favorable conditions for such a pro- 
gram of adjustment as that over which the author presides, any teacher 
or principal who reads the book will find in it much that will be sug- 
gestive and altogether possible in dealing with pupils under his charge. 


FRANKLIN W. JOHNSON. 
Teachers College, Columbia University. 